Research on Citrus Fruit Freshness Detection Based on Near-Infrared Spectroscopy

Chen, Ling; Jia, Youdong; Zhang, Jianrong; Wang, Lei; Yang, Rui; Su, Yun; Li, Xinzhi

doi:10.3390/pr12091939

Open AccessArticle

Research on Citrus Fruit Freshness Detection Based on Near-Infrared Spectroscopy

by

Ling Chen

,

Youdong Jia

^*

,

Jianrong Zhang

,

Lei Wang

,

Rui Yang

,

Yun Su

and

Xinzhi Li

Faculty of Mechanical and Electrical Engineering, Kunming University, Kunming 650214, China

^*

Author to whom correspondence should be addressed.

Processes 2024, 12(9), 1939; https://doi.org/10.3390/pr12091939

Submission received: 12 June 2024 / Revised: 24 August 2024 / Accepted: 5 September 2024 / Published: 10 September 2024

(This article belongs to the Section Food Process Engineering)

Download

Browse Figures

Versions Notes

Abstract

:

The study developed a novel method for evaluating the freshness of citrus fruits by integrating near-infrared spectroscopy with the non-linear data processing capabilities of a BP neural network. This approach utilizes specific wavelength analysis to distinguish between fresh and non-fresh fruits effectively. Advanced pre-processing techniques are employed to remove spectral anomalies, enhancing the network’s ability to accurately identify crucial quality indicators like sugar content. Concurrently, an experiment utilizing a mathematical computing software -based BP neural network optimized the number of hidden layer nodes, identifying 61 as optimal. This configuration achieves impressive indicators, including a mean square error of 0.0025665 and a root mean square error of 49.8214. More than 1000 training iterations were performed on 100 citrus samples, and the learning rate was 80%. The model demonstrated a high accuracy rate of 97.6275%, confirming its precision and reliability in assessing citrus freshness. This synergy between advanced neural network processing and spectroscopic techniques marks a significant advancement in agricultural quality assessment, setting new standards for speed and efficiency in data processing.

Keywords:

citrus fruit freshness assessment; near-infrared spectroscopy; BP neural network; non-linear data processing

1. Introduction

Citrus fruits are of global importance [1,2] due to their high content of vitamin C, soluble sugars, organic acids, and flavonoids, which together contribute to the distinctive citrus flavor [3,4]. The challenge lies in the visual similarity of citrus fruits at different stages of ripeness, making it difficult for consumers to differentiate between them. This similarity in appearance can lead to potential discrepancies in quality and pricing, with instances of lower-priced citrus being deceptively marketed as premium products to generate undue profits for unscrupulous traders.

Currently, analysis of the chemical composition and internal structural details of citrus can be performed using chromatographic techniques such as high-performance liquid chromatography [HPLC] with different detectors [5,6] and gas chromatography-mass spectrometry (GC-MS) [7,8]. However, these chromatographic methods require complicated sample preparation and expensive equipment, while lacking the ability to perform non-destructive characterization.

Near-Infrared Diffuse Reflectance Spectroscopy (NIRDRS) is regarded as a non-destructive, rapid, and environmentally friendly technique. However, the complexity of sample compositions, along with surface roughness and instrumental noise, can give rise to spectral interferences, including peak overlap, baseline drift, and background noise, which complicates the analysis [9,10,11,12]. To address these issues, the use of a combination of spectral pre-treatment methods is crucial for refining data and minimizing interferences [13,14]. Additionally, the optimization algorithm of the backpropagation neural network excels in linear regression predictions and effectively differentiates between fresh and non-fresh categories by analyzing waveform characteristics. The implementation of multiple orientations for sample collection and detection, particularly in the case of oranges, significantly enhances the robustness of the experiment and enables a more comprehensive evaluation of fruit quality, which may vary with orientation. In the current global context, citrus fruits hold significant economic and health value due to their rich nutritional content. Although traditional chromatographic techniques, such as high-performance liquid chromatography (HPLC) and gas chromatography-mass spectrometry (GC-MS), are capable of providing detailed chemical composition analysis, they often involve complex sample preparation and expensive equipment and cannot perform non-destructive testing [15]. Consequently, the development of a rapid, economical, and environmentally friendly method is crucial for enhancing market regulation and consumer trust in citrus products [16].

Near-Infrared Diffuse Reflectance Spectroscopy (NIRDRS) is regarded as a valuable tool for addressing these issues, given its non-destructive, rapid, and eco-friendly characteristics. Research conducted both domestically and internationally has demonstrated that NIRDRS, when combined with spectral pre-processing methods and backpropagation neural network optimization algorithms, has been successfully applied to distinguish the freshness of citrus fruits. This evidence suggests that NIRDRS has significant potential for real-time and onsite testing [17,18,19]. Furthermore, by modifying the orientation of the sample and collecting spectral data from multiple directions, the accuracy and stability of the analysis can be significantly enhanced, effectively addressing the errors caused by uneven fruit surfaces and spectral noise [20,21]. This study’s innovation lies in the application of NIRDRS technology combined with multi-directional sampling and neural network algorithms, providing a new, rapid, and accurate method for assessing the quality and freshness of citrus fruits. The implementation of this method not only enhances consumer trust in product quality but also promotes market fairness and curbs fraudulent practices by unscrupulous traders, with significant social and economic implications [22,23]. Future research will further explore the effectiveness of this technology across different varieties and maturity stages of citrus fruits, with the aim of broadening its application in agriculture and the food industry [24,25].

In the experimental design for detecting the freshness of citrus fruits, the setup of the experiment, the pre-processing of spectral data, and the development of algorithmic models are of particular importance. This document provides a comprehensive account of each stage of the experiment, including the configuration of the equipment, the methodology for processing spectral data, and the application of neural network models. The objective of this study is to adopt a scientific and systematic approach to accurately assess the freshness of citrus fruits.

2. Materials and Methods

The study used a number of different devices, including a near-infrared spectrometer(ATP5020R micro fiber spectrometer with high sensitivity and high resolution produced by OptoSun, China.), a reference whiteboard, an iron frame, and a laptop. As the main instrument of this study, a near-infrared spectrometer can obtain high-quality spectral data by accurately controlling the parameter settings of the spectrometer, such as scanning range, resolution, and integration time, which lays a solid foundation for subsequent analysis. The reference whiteboard is used to calibrate the spectral data, which can calibrate the spectrometer before each measurement, eliminate the influence of external factors such as ambient light and instrument drift on the data, and ensure the measurement accuracy. The iron frame is used to stabilize the spectrometer and the sample, which not only ensures the stability of the spectrometer in the data acquisition process, but also realizes the standardized acquisition of spectral data by maintaining the required distance and angle. The laptop is equipped with a control software designed for spectral analysis, which can control the working state of the spectrometer in real time and receive and store spectral data. At the same time, the software also has built-in preliminary data processing modules, such as baseline correction, noise filtering, and spectral pre-processing, which effectively improves the quality and analyzability of the data.

2.1. Construction of the Experimental Platform

In order to construct a stable and easy-to-operate experimental platform, it is essential to ensure a connection between the instrument and the computer that allows for real-time data collection and processing. A logical layout of the experimental platform facilitates enhanced experimental efficiency and the repeatability of the data. Prior to the commencement of the experiment, the equipment must undergo a 20-min warm-up period to ensure optimal performance. The spectrometer’s scanning parameters, including integration time and the number of scans, are configured to ensure the collection of high-quality data. It is of the utmost importance that samples are kept clean and accurately aligned with the spectrometer during the preparation stage.

2.2. Experimental Sample Collection

This study proposes a method for evaluating the quality and freshness of citrus fruits by combining near-infrared diffuse reflectance spectroscopy with multi-directional sampling and neural network algorithms. In the preparation stage of the experiment, a hundred citrus samples were randomly collected for the study. Then, the data of 100 citrus samples were collected by spectrometer. The spectral data were preprocessed, as shown in Table 1. Ensure the versatility and feasibility of the experiment (Table 1).

Let us detail the mathematical expressions and their interpretations for both the polynomial fitting and rolling ball algorithm methods you mentioned.

2.2.1. Polynomial Fitting

In polynomial fitting, a polynomial of degree (n) is fit to spectral data to model and remove baseline trends. The general form of the polynomial fitted to the data can be expressed as:

P (x) = a_{0} + a_{1 x} + a_{2 x}^{2} + \dots + a_{n x}^{n}

(1)

where

(x)

represents the spectral data points,

P (x)

is the polynomial model, and

(a_{0}, a_{1}, \dots, a_{n})

are the coefficients determined during the fitting process.

Interpretation:

The polynomial function

P (x)

is subtracted from the original spectral data to eliminate the underlying baseline trend. This step helps in focusing on more subtle features in the spectral data by reducing systematic variations that do not relate to the changes in interest.

2.2.2. Rolling Ball Algorithm

The rolling ball algorithm simulates a virtual ball that rolls along the spectral curve. It touches the spectrum only at its lowest points and uses these points to define a new baseline. The algorithm can be visualized as a function that adjusts dynamically to the lowest contact points of a moving ball along the spectrum curve.

Mathematical Interpretation:

Let

f (x)

be the original spectral curve. The rolling ball algorithm computes a new baseline

B (x)

, where B(x) is defined by the path traced by the lowest point of the ball as it rolls along f(x).

B (x) = m i n {f (x + r \ s i n θ)∣ θ \ i n [- \frac{π}{2}, \frac{π}{2}]}

(2)

where r is the radius of the ball and

θ

varies over the angles at which the ball might contact the spectral curve.

Interpretation:

This method is effective for removing low-frequency background noise because it continuously adapts the baseline depending on the curvature and local minima of the spectral data. It is especially useful in scenarios where the noise has a slow-varying, smooth component, which can be approximated by the envelope formed by the virtual rolling ball.

These methods provide robust tools for pre-processing spectral data, ensuring that subsequent analyses are more focused on meaningful signals rather than artifacts or persistent trends.

Here are the mathematical formulas and their explanations for the Savitzky–Golay filter and Fourier transform filter.

2.3. Savitzky–Golay Filter

The Savitzky–Golay filter is used to smooth spectral data by fitting successive subsets of adjacent data points with a low-degree polynomial by the method of least squares. The formula for the Savitzky–Golay filter is given by the convolution of the signal

x (t)

with a set of polynomial coefficients (c) derived from the least squares fit:

y (t) = \sum_{i = - m}^{m} c_{i} \cdot x (t + i)

(3)

where y(t) is the smoothed value of the signal at time t, x(t + i) are the data points within the window of size 2m + 1 centered at (t),

(c_{i})

are the polynomial coefficients, which depend on the degree of the polynomial and the window size.

This filter helps maintain high-frequency features of the signal while removing random noise, ensuring that important details like peaks and troughs are preserved.

2.4. Fourier Transform Filter

The Fourier transform filter involves transforming spectral data to the frequency domain, filtering out specific frequency components, and then transforming the data back to the time domain. The general steps are as follows:

Fourier transform: Convert the time-domain signal x(t) into the frequency domain using the Fourier transform:

X (f) = \int_{- \infty}^{\infty} x (t) e^{- j 2 π f t} d t

(4)

Frequency Domain Filtering:

Apply a filter

H (f)

to the transformed data to attenuate or remove unwanted frequencies:

Y (f) = H (f) \cdot X (f)

(5)

Inverse Fourier Transform: Convert the filtered frequency domain data back to the time domain:

y (t) = \int_{- \infty}^{\infty} Y (f) e^{j 2 π f t} d f

(6)

where

X (f)

and Y(f) are the Fourier transforms of the original and filtered signals, respectively,

H (f)

is the frequency response of the filter,

(f)

represents frequency, and

(t)

represents time.

This method is effective for isolating and removing specific types of noise that have characteristic frequency distributions, thereby cleaning up the signal without significantly distorting the original data.

In spectral analysis, particularly when dealing with spectral data, two commonly used normalization methods are the Standard Normal Variable Transformation (SNV) and Vector Normalization. Both techniques aim to reduce the systematic differences between samples, making the data more comparable.

2.5. Standard Normal Variable Transformation (SNV)

Formula:

X_{i j}^{'} = \frac{X_{i j} - {\bar{X}}_{i}}{s_{i}}

(7)

where

$X_{i j}$ represents the original spectral value at the $j^{t h}$ wavelength for the $i^{t h}$ sample.
${\bar{X}}_{i}$ is the mean spectral value across all wavelengths for the ${\bar{X}}_{i}$ sample.
$s_{i}$ is the standard deviation of the spectral values for the $i^{t h}$ sample.

Explanation:

The SNV technique adjusts each spectrum individually by subtracting the mean and dividing by the standard deviation of that spectrum. This transformation standardizes each spectrum to have a mean of zero and a variance of one, effectively normalizing the data and reducing variations that are not related to compositional differences among samples.

2.6. Vector Normalization

Formula:

X_{i j}^{'} = \frac{X_{i j}}{∥ X_{i} ∥}

(8)

where

∥ X_{i} ∥

represents the norm of the spectrum for the $i^{th}$ sample, calculated typically as the Euclidean norm.

Explanation:

Vector normalization scales each spectrum, such that the length (norm) of the vector representing the spectrum is equal to one. This method ensures that all spectra are compared on the same scale, emphasizing shape differences over magnitude differences, which is particularly useful in various multi-variate analysis applications where the magnitude of the spectral vector might skew the analysis.

To address the calculation of the first and second derivatives of spectral data, here are the mathematical formulations and their detailed explanations:

2.7. First Derivative

The first derivative of a function, in this context, spectral data represented by

f (λ)

where

λ

is the wavelength, measures the rate of change in spectral intensity with respect to wavelength. Mathematically, the first derivative is given by:

f^{'} (λ) = \frac{d f (λ)}{d λ}

(9)

Purpose and Application:

Highlighting Details: The first derivative accentuates the fine details within the spectral curves. This can be particularly useful in resolving peaks that are close together (overlapping spectral peaks).

Rate of Change: It provides a clearer visualization of how the spectral intensity changes with wavelength, which is crucial for qualitative analysis.

2.8. Second Derivative

The second derivative of spectral data is the derivative of the first derivative. It measures the rate at which the rate of change in spectral intensity is changing, providing a deeper insight into the curvature of the spectral data. Mathematically, it is expressed as:

f^{″} (λ) = \frac{d^{2} f (λ)}{d λ^{2}}

(10)

Purpose and Application:

Refining Small Changes: The second derivative is particularly sensitive to small changes in the data. This sensitivity is crucial for more accurate analysis of peak shapes and positions, aiding in the identification of subtle features that might be missed by only examining the original data or its first derivative.

Enhanced Peak Resolution: By highlighting inflection points, the second derivative can more accurately determine the beginning and end of peaks, thus improving the resolution of closely spaced or overlapping peaks.

Conclusion: Using the first and second derivatives in the analysis of spectral data enables a more nuanced understanding of the data, enhancing both qualitative and quantitative analyses. These derivatives are especially valuable in complex spectral environments where simple peak assignments are challenging.

The imported data, which has been processed into a .csv file format, is then subjected to an algorithmic process in MATLAB. This produces spectral graphs of fresh and non-fresh citrus fruits. The experimental data fully reflects the internal structural information of the citrus fruit, with the wave peaks of three types of samples being close in wavelength. Although there are differences between individual spectra, the waveforms are similar, and the positions of the wave peaks are generally consistent. The sample slices provide a more comprehensive reflection of internal information. Transitional type fruits exhibit greater spectral fluctuations after picking, with their absorbance showing an initial increase followed by a decrease. In contrast, non-transitional fruits do not show significant spectral changes in the short term. The imported data has been meticulously formatted into .csv files, with a representative dataset illustrated in Figure 1 and Figure 2.

2.9. Algorithm Model Building

The BP open-source neural network algorithm has been introduced to optimize the particle swarm algorithm. The acquired .csv file is divided into training, testing, and validation sets following a ratio of 7:2:1.

n_{train} = 0.7 \times N, n_{test} = 0.2 \times N, n_{validation} = 0.1 \times N

(11)

where N is the total number of samples in the .csv file.

The split data function is employed to carry out this segmentation, leveraging the profound self-learning and adaptive capabilities inherently present in the BP neural network. During the training phase, the network autonomously discerns and internalizes “reasonable rules” governing the relationship between input and output data, embedding this acquired knowledge within the network’s weight parameters. This innate capacity for self-learning and adaptation empowers the BP neural network to adeptly manage and scrutinize spectral datasets, swiftly adapting to novel data instances and evolving scenarios.

The back-propagation algorithm, a fundamental component of the BP neural network, serves as a potent optimization mechanism.

w_{i j}^{(n e w)} = w_{i j}^{(o l d)} - η \frac{\partial E}{\partial w_{i j}}

(12)

where

w_{i j}

are the weights,

η

is the learning rate, and

\frac{\partial E}{\partial w_{i j}}

is the gradient of the error with respect to the weights.

It dynamically adjusts network parameters in response to prediction errors, aiming to continuously refine prediction accuracy. Moreover, the integration of diverse enhanced BP algorithms and advanced optimization methodologies can further enhance the network’s optimization and prediction capabilities. MATLAB serves as a platform for implementing BP neural network predictions on spectral data. In the realm of spectral data processing, neural networks initially engage in data preprocessing tasks, including noise elimination, data normalization, and spectral feature extraction, involving wavelengths and absorptions. Relevant features pertinent to predictive properties, such as peak positions and intensities, are derived from near-infrared spectral data. In the realm of spectral data processing, neural networks initially undertake data preprocessing tasks, encompassing noise elimination, data normalization, and spectral feature extraction involving wavelengths and absorptions. Relevant features pertinent to predictive properties, such as peak positions and intensities, are extracted from near-infrared spectral data. These extracted features, along with their corresponding property values, function as the training inputs for the BP neural network. Following training, the model is deployed in practical scenarios or analyses to swiftly forecast new near-infrared spectral data and extract the desired property values. Proficient in managing spectral datasets, the BP neural network showcases inherent parallel processing capabilities suitable for large-scale information analysis, thus facilitating enhanced characteristic identification and playing a crucial role in assessing citrus fruit freshness. Remarkably, even in cases of localized or partial neuron damage within the network, the overall training outcomes remain minimally affected, leading to significant reductions in experimental error rates.

δ_{k} = (y_{k} - t_{k}) ϕ^{'} (z_{k}) δ_{j} = (\sum_{k} w_{j k} δ_{k}) ϕ^{'} (z_{j})

(13)

where

δ_{k}

is the error term for the output layer,

δ_{j}

is the error term for the hidden layers,

y_{k}

is the output,

t_{k}

is the target,

ϕ^{'}

is the derivative of the activation function, and

z

are the input sums to the neurons.

The BP neural network adheres to the quintessential multi-layer feed-forward network design, comprising an input layer, multiple hidden layers (ranging from one to several layers), and an output layer. Connections between layers are fully established, with neurons within the same layer lacking inter-neuron connections. A three-layer network housing a single hidden layer can effectively approximate any non-linear function.

Stochastic Gradient Descent (SGD):

w = w - η \nabla E

(14)

Adam Optimizer:

w_{t + 1} = w_{t} - \frac{η}{\sqrt{{\hat{v}}_{t}} + ϵ} {\hat{m}}_{t}

(15)

where

{\hat{m}}_{t}

and

{\hat{v}}_{t}

are estimates of the first and second moments of the gradients, respectively, and

ϵ

is a small number to prevent division by zero.

As depicted in Figure 3, during the training phase, a single input, 61 hidden layers, and a single output are observed. Each neuron receives input signals from inter-connected neurons, with each signal transmission governed by associated weights. Neurons aggregate these signals to compute a cumulative input value, which is then compared against a threshold (emulating a threshold potential). The final output is generated by subjecting the input to an “activation function” (mimicking cellular activation), and the output then serves as input for downstream neurons in a cascading fashion. In cases where the actual output deviates from the expected output, the neural network initiates error backpropagation. This mechanism involves retroactively propagating errors through the network layers, starting from the output layer. Gradient descent optimization enables the network to adjust the weights of each layer, iteratively refining the model’s predictions by backpropagating the error signals towards the hidden and input layers.

The meticulously prepared training data must be input into MATLAB to instantiate a BP neural network. This involves employing the backpropagation algorithm, which iteratively adjusts the weights and biases within the network to minimize the loss function. Initializing the weights and thresholds for each layer requires small random values, typically ranging between −1 and 1 for weights and set at or near 0 for thresholds. The error is then propagated backward from the output layer based on the calculated error, with the weights and thresholds being updated according to the error adjustment rules. This iterative error correction process continues until it reaches the input layer. Various optimization techniques and regularization methods, including stochastic gradient descent, the Adam optimizer, and L2 regularization, can enhance the model’s performance. When evaluating the model, it should be applied to predict the test data, and the disparities between predicted and actual outcomes should be computed. The efficacy of the model can be assessed using metrics, such as mean squared error or cross-entropy. Model refinement involves adjusting parameters such as the number of layers, neuron count, learning rate, range of hidden layer nodes, and training iterations to optimize performance. Figure 4 illustrates that the model achieves peak performance during the 51st training iteration.

3. Results

In this study, we employed the MATLAB backpropagation (BP) neural network algorithm, modifying the number of nodes in the hidden layer from 2 to 61. We determined that an optimal configuration consists of 61 nodes, achieving a minimal mean squared error (MSE) of 0.0025665. Additionally, the sum of squared errors (SSE) and the mean absolute error (MAE) were calculated to be 248,217.3298 and 16.3988, respectively.

The performance indicators shown in Figure 5 include MSE of 2482.1733, root mean square error (RMSE) of 49.8214, and mean absolute percentage error (MAPE) of 2.3725%. The training scheme includes 1000 iterations of 100 citrus samples, the learning rate is 80%, and the target error threshold is 0.000005. The corresponding prediction errors are shown in Figure 6.

The algorithm’s efficacy was further demonstrated through its application in linear regression predictions on a spectral dataset, where it attained a remarkable accuracy rate of 97.6275%. The analysis covered wavelengths ranging from 600 to 243,900 nm, incorporating various preprocessing steps that ensured a smooth response curve and the effective elimination of interfering factors. This meticulous preprocessing enabled the algorithm to discern subtle variations in the freshness of citrus fruits, facilitating reliable differentiation as shown in Figure 7. The graphical representation of the input and output data is depicted in Figure 8, and the effects of data fitting are presented below.

4. Discussion

In this investigation, we utilized a MATLAB-based backpropagation (BP) neural network technique to assess the freshness of citrus fruits. The configured model, comprising 61 hidden layer nodes, achieved a high accuracy rate of 97.6275% after 1000 training iterations. This remarkable accuracy underscores the BP neural network’s robust ability to predict fruit freshness with precision. The notably low mean square error further highlights the minimal discrepancy between predicted and actual values, underscoring the model’s reliability [26]. Additionally, the root mean square error and the mean percentage error metrics reinforce the model’s efficient performance [27]. Complementing these findings, linear regression analyses of spectral data ranging from 600 to 900 nm validated the model’s effectiveness in distinguishing varying freshness levels in citrus fruits [28]. Relative to prior studies, the exceptional control of prediction errors and accuracy in this experiment can be attributed to its distinctive experimental design and sophisticated data processing techniques [29]. Nevertheless, it is essential to acknowledge that the study’s reliance on specific wavelength data might raise concerns about potential overfitting [30]. Future studies should integrate environmental factors such as temperature and humidity to augment the generalization capability and broader applicability of the model [31]. Additionally, it is imperative to broaden the cohort of study subjects and employ more varied methodologies to enhance the practical deployment of the model and to assess its efficacy under real-world conditions. Such efforts are crucial for a thorough appraisal of the model’s utility in the quality monitoring of agricultural products [32]. This research not only furthers the use of backpropagation neural networks in food safety surveillance but also underpins the technological advancement of automated food quality control systems [33].

5. Conclusions

In this study, a new method for evaluating the freshness of citrus fruits by combining near infrared spectroscopy with the non-linear data processing ability of a BP neural network was proposed. This method uses specific wavelength analysis to effectively distinguish fresh and non-fresh fruits. Advanced pre-treatment techniques were used to remove spectral anomalies, which enhanced the ability of the network to accurately identify key quality indicators such as sugar content.

This research employs a MATLAB-based backpropagation (BP) neural network to effectively address the complex non-linear relationships inherent in spectral data for the analysis of fruit freshness. Configured with 61 hidden layer nodes and refined through 1000 training iterations, the BP neural network achieved an accuracy rate of 97.6275%. This method is particularly relevant for analyzing non-linear interactions between spectral data and substance absorbances, such as sugar content, making it ideal for managing extensive near-infrared spectroscopy datasets. The computational efficiency and scalability of the BP neural network enhance data processing speeds, crucial in agricultural applications and the food industry.

The use of near-infrared spectroscopy to differentiate wave characteristics among fruits of varying freshness levels offers a robust method for assessing fruit freshness. This capability is critical for improving fruit preservation and storage practices, enabling precise traceability of fruit quality back to specific production batches or origins, thereby enhancing food safety and quality control standards.

Moreover, the non-invasive nature of near-infrared spectroscopy preserves the integrity of fruits during quality assessments, invaluable in contexts where appearance significantly influences pricing. This method eschews chemical reagents and destructive testing, aligning with contemporary green production practices and ensuring no harm to the environment or personnel.

Despite the simplicity, speed, and cost-effectiveness of near-infrared spectroscopy (NIRS) for detecting fruit and vegetable quality, several challenges persist. These challenges include the requirement for high sample uniformity and the susceptibility to external influences, such as variations in sample temperature and detection site conditions. Future research should prioritize the investigation of optical properties in standard samples, the exploration of characteristic absorption bands, and the development of tailored detection strategies for diverse fruit and vegetable types. Moreover, to enhance the practicality and industry acceptance of NIRS, efforts should be directed towards mitigating inter-instrument discrepancies, implementing temperature corrections, and refining wavelength and energy calibration techniques. These advancements are crucial for meeting stringent detection requirements, reducing instrument costs, and promoting broader adoption within the industry.

Author Contributions

Conceptualization, L.C. and Y.J.; methodology, Y.J.; software, J.Z.; validation, X.L., L.W. and Y.S.; formal analysis, L.C.; investigation, Y.J.; resources, L.C.; data curation, J.Z.; writing—original draft preparation, Y.J.; writing—review and editing, X.L.; visualization, R.Y.; supervision, L.C.; project administration, L.C.; funding acquisition, L.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Special Basic Cooperative Research Programs of Yunnan Provincial Undergraduate Universities, grant number 202101BA070001-158 and “The APC was funded by Faculty of Mechanical and Electrical Engineering, Kunming University”.

Data Availability Statement

The original contributions presented in the study are included in the article, further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Wu, X.; Chen, B.; Li, X. Determination of Total Acids and Total Sugars in Citrus Fruits. Food Res. Dev. 2012, 33, 144–146. [Google Scholar]
Gou, Z.; Xu, W.; Liu, J.; Yi, Y.; Yang, Y. Research progress on extraction and analysis methods of flavonoids from citrus fruits. J. Food Saf. Qual. Insp. 2013, 4, 1556–1562. [Google Scholar]
Li, Y.; Zheng, Z.; Liu, L.; Xu, J.; Lei, R. Determination of Mineral Elements in Five Citrus Fruits. Food Ind. 2016, 37, 281–284. [Google Scholar]
Pashaei, M.; Hassanpour, H. Phenolic, amino acids, and fatty acids profiles and the nutritional properties in the fresh and dried fruits of black rosehip (Rosa pimpinellifolia L.). Sci. Rep. 2024, 14, 19665. [Google Scholar] [CrossRef] [PubMed]
Bertoglio, R.; Piliego, M.; Guadagna, P.; Gatti, M.; Poni, S.; Matteucci, M. On-the-go table grape ripeness estimation via proximal snapshot hyperspectral imaging. Comput. Electron. Agric. 2024, 226, 109354. [Google Scholar] [CrossRef]
Jiang, X.; Wu, W.; Ren, H.; Zhang, F.; Mo, W.; Lu, Z. Asymmetric hydrogenation of aromatic ketones catalyzed by a new chiral tridentate PNN ligand manganese complex. Acta Chem. Sin. 2024, 82, 736–898. [Google Scholar]
Zhang, S.; Wang, L.; Feng, J.; Jiang, Y.; Li, L.; Hu, Y.; Feng, J. Preparation of flexible silica aerogel composite felt based on aqueous fumed silica slurry. Sci. China (Mater.) 2024, 67, 1332–1339. [Google Scholar] [CrossRef]
Huang, Y.; Wang, Y.; He, Y.; Peng, J.; Min, S. Rapid determination of ethanol and 1,2-propanediol in tobacco flavors and fragrances by near infrared spectroscopy. Spectrosc. Spectr. Anal. 2023, 43 (Suppl. 1), 89–90. [Google Scholar]
Yang, Z.; Wang, H.; Tian, M.; Li, J.; Zhao, L. Near-infrared spectroscopy and multi-band camera imaging detection of early bruises in apples. Spectrosc. Spectr. Anal. 2024, 44, 1364–1371. [Google Scholar]
Li, Z.; Hou, M.; Cui, S.; Chen, M.; Liu, Y.; Li, X.; Chen, H.; Liu, L. Method for determination of flavonoid content in peanut kernel by near-infrared analysis. Spectrosc. Spectr. Anal. 2024, 44, 1112–1116. [Google Scholar]
Kang, M.; Wang, C.; Sun, H.; Li, L.; Luo, B. Study on internal quality detection method of cherry tomato based on improved WOA-LSSVM. Spectrosc. Spectr. Anal. 2023, 43, 3541–3550. [Google Scholar]
Cai, J.; Huang, C.; Ma, L.; Zhai, L.; Guo, Z. One-dimensional convolutional neural network handheld visible/near-infrared citrus soluble solids content non-destructive testing system. Spectrosc. Spectr. Anal. 2023, 43, 2792–2798. [Google Scholar]
Fan, P.; Chu, D.; Wang, H.; Li, L.; Liu, Y. Visible-Near Infrared Reflectance Spectra and Spectral Analysis of Soil Phosphorus Fractions. Spectrosc. Spectr. Anal. 2023, 43 (Suppl. 1), 73–74. [Google Scholar]
Jiang, X.; Zhu, M.; Yao, J.; Li, B.; Liao, J.; Liu, Y.; Zhang, J.; Jing, H. Parameter optimization of apple sugar content model based on near infrared on-line device. Spectrosc. Spectr. Anal. 2023, 43, 116–121. [Google Scholar]
Xiao, X.; Liu, F.; Sun, M.; Tang, Z.; Wu, Y.; Lyu, J.; Khan, K.S.; Yu, J. Development of a high-performance liquid chromatography method for simultaneous quantification of sixteen polyphenols and application to tomato. J. Chromatogr. A 2024, 1733, 465254. [Google Scholar] [CrossRef]
Sun, H.; Xue, J.; Song, Y.; Wang, P.; Wen, Y.; Zhang, T. Detection of fruit tree diseases in natural environments: A novel approach based on stereo camera and deep learning. Eng. Appl. Artif. Intell. 2024, 137, 109148. [Google Scholar] [CrossRef]
Liu, Z.; Feng, S.; Zhao, D.; Li, J.; Guan, Q.; Xu, T. Research on spectral feature extraction and detection method of rice leaf blast based on UAV hyperspectral remote sensing. Spectrosc. Spectr. Anal. 2024, 44, 1457–1463. [Google Scholar]
Guo, G.; Zhang, M.; Gong, Z.; Zhang, S.; Wang, X.; Zhou, Z.; Yang, Y.; Xie, G. Construction of biomass ash content model based on near-infrared spectroscopy and complex sample partition set. Spectrosc. Spectr. Anal. 2023, 43, 3143–3149. [Google Scholar]
Cui, T.; Lu, Z.; Xue, L.; Wan, S.; Zhao, K.; Wang, H. Study on rapid detection model of tomato sugar based on near infrared reflectance spectroscopy. Spectrosc. Spectr. Anal. 2023, 43, 1218–1224. [Google Scholar]
Zhao, Z.; Wang, X.; Liu, D.; Wang, Y.; Gu, Y.; Teng, J.; Niu, X. Nondestructive detection of plum fruit quality by near infrared spectroscopy based on BP-ANN and PLS. Spectrosc. Spectr. Anal. 2022, 42, 2836–2842. [Google Scholar]
Liu, Y.; Liao, J.; Li, B.; Jiang, X.; Zhu, M.; Yao, J.; Wang, Q. Robustness of global model of soluble solids in Gongli pear based on near infrared spectroscopy. Spectrosc. Spectr. Anal. 2022, 42, 2781–2787. [Google Scholar]
Liu, Y.; Li, M.; Hu, J.; Xu, Z.; Cui, H. Near-infrared hyperspectral detection of navel orange granulation. Spectrosc. Spectr. Anal. 2022, 42, 1366–1371. [Google Scholar]
Ma, H.; Zhang, K.; Ji, J.; Jin, X.; Zhao, K. Study on quantitative detection technology of freshness of Agaricus bisporus based on spectroscopy. Spectrosc. Spectr. Anal. 2021, 41, 3740–3746. [Google Scholar]
Wang, D.; Han, P.; Wu, J.; Zhao, L.; Xu, H. Nondestructive and rapid identification of waxy corn seed thermal damage particles based on near-ultraviolet-visible-shortwave near-infrared multispectral imaging data. Spectrosc. Spectr. Anal. 2021, 41, 2696–2702. [Google Scholar]
Zhao, S.; Yu, H.; Gao, G.; Chen, N.; Wang, B.; Wang, Q.; Liu, H. Peanut protein components and their subunits content determination by near infrared analysis. Spectrosc. Spectr. Anal. 2021, 41, 912–917. [Google Scholar]
Pan, J.; Qi, Y.; Chen, L.; Xia, Y.; Lv, X. Identification of pear leaf diseases based on visible/near-infrared hyperspectral imaging technology. Chin. J. Agric. Mach. Chem. 2024, 45, 162–169. [Google Scholar]
Zhang, Y.; Yang, G.; Wang, M.; Han, Z.; Zhu, G.; Shi, J.; Liu, X.; Han, T.; Zhou, X. Influencing factors of non-destructive detection of moisture content in fresh corn ears by near infrared spectroscopy. Acta Agric. Eng. Sci. 2024, 1–9. [Google Scholar]
Sun, T.; Li, H.; Kong, L.; Fan, Z. Effect of spot diameter of light source on near-infrared detection of apple core rot. Acta Agric. Eng. 2023, 39, 298–305. [Google Scholar]
Feng, H.; Li, L.; Wang, D.; Zhang, K.; Feng, M.; Song, H.; Li, R.; Han, P. Application progress of mid-infrared and near-infrared spectroscopy in quality detection of small grains. Spectrosc. Spectr. Anal. 2023, 43, 16–24. [Google Scholar]
Bi, S.; Li, X.; Shen, T.; Xu, Y.; Ma, L. Apple classification method based on multi-model evidence fusion. J. Agric. Eng. 2022, 38, 141–149. [Google Scholar]
Li, J.; Yu, M.; Li, M.; Zheng, Y.; Li, P. Nondestructive identification of chrysanthemum varieties by near infrared spectroscopy and pattern recognition. Spectrosc. Spectr. Anal. 2022, 42, 1129–1133. [Google Scholar]
Shao, Z.; Wang, H.; Dong, Z.; Yuan, Y.; Li, J.; Zhao, L. Apple early damage detection based on near-infrared camera imaging and threshold segmentation. Agric. Mach. J. 2021, 52 (Suppl. 1), 134–139. [Google Scholar]
Zhang, J.; Xu, Y.; Jiang, Y.; Zheng, C.; Zhou, J.; Han, C. Research progress on the application of near infrared spectroscopy in the quality detection of grape and its products. Spectrosc. Spectr. Anal. 2021, 41, 3653–3659. [Google Scholar]

Figure 1. Spectral map of fresh citrus.

Figure 2. Transitional type spectral pattern of stale citrus.

Figure 3. Diagram of the structure of the neural network.

Figure 4. Training indicators.

Figure 5. Neural network performance map.

Figure 6. Error prediction graph.

Figure 7. Regression fitting plot.

Figure 8. Data set vs. time amplitude plot.

Table 1. Spectral Data Pre-processing.

Procedure	Purpose	Methods
Baseline Correction	Remove background noise from instruments, the environment, and sample processing.	Polynomial fitting involves fitting a polynomial to the data and subtracting it to remove baseline trends. The rolling ball algorithm rolls a virtual ball along the spectral curve, using the lowest points to establish a new baseline, effectively eliminating low-frequency noise.
Noise Reduction	Remove random noise caused by things like instruments, electronics, and the environment.	The Savitzky–Golay filter uses local polynomial regression to smooth spectral data while preserving high-frequency features, effectively removing random noise. The Fourier transform filter processes spectral data by transforming it to the frequency domain, filtering out noise, and then reconverting it back to the time domain for clean analysis.
Normalization	Reduce differences between samples.	The Savitzky–Golay filter uses local polynomial regression to smooth spectral data, preserving high-frequency features while eliminating noise. Meanwhile, the Fourier transform filter processes spectral data by filtering noise in the frequency domain and then reverting to the time domain.
Derivative Spectroscopy	Improve the spectral data by enhancing subtle features and increasing sensitivity and resolution.	First derivative calculation emphasizes details and changes in spectral curves, aiding in resolving overlapping peaks. The second derivative more precisely captures subtle variations in data, particularly in peak shapes and positions.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chen, L.; Jia, Y.; Zhang, J.; Wang, L.; Yang, R.; Su, Y.; Li, X. Research on Citrus Fruit Freshness Detection Based on Near-Infrared Spectroscopy. Processes 2024, 12, 1939. https://doi.org/10.3390/pr12091939

AMA Style

Chen L, Jia Y, Zhang J, Wang L, Yang R, Su Y, Li X. Research on Citrus Fruit Freshness Detection Based on Near-Infrared Spectroscopy. Processes. 2024; 12(9):1939. https://doi.org/10.3390/pr12091939

Chicago/Turabian Style

Chen, Ling, Youdong Jia, Jianrong Zhang, Lei Wang, Rui Yang, Yun Su, and Xinzhi Li. 2024. "Research on Citrus Fruit Freshness Detection Based on Near-Infrared Spectroscopy" Processes 12, no. 9: 1939. https://doi.org/10.3390/pr12091939

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Research on Citrus Fruit Freshness Detection Based on Near-Infrared Spectroscopy

Abstract

1. Introduction

2. Materials and Methods

2.1. Construction of the Experimental Platform

2.2. Experimental Sample Collection

2.2.1. Polynomial Fitting

2.2.2. Rolling Ball Algorithm

2.3. Savitzky–Golay Filter

2.4. Fourier Transform Filter

2.5. Standard Normal Variable Transformation (SNV)

2.6. Vector Normalization

2.7. First Derivative

2.8. Second Derivative

2.9. Algorithm Model Building

3. Results

4. Discussion

5. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI