1. Introduction
Because fish quality directly affects consumer health, its assessment warrants significant attention. Moreover, the growing global consumption of fish [1], coupled with increasing concerns about food hygiene and safety among both regulatory bodies and consumers, underscores the urgency of addressing the challenges in fish quality control. According to statistics from the Institute of Marine Research, Vietnam’s fishery production from 2011 to 2015 included 493.9 thousand tons of mackerel, 42.6 thousand tons of tuna, and 344 thousand tons of pompano [2]. Tuna, mackerel, and pompano are thus important seafood products of Vietnam’s fishing grounds. These three types of fish are sold in both domestic and export markets, which makes enhanced quality monitoring for them crucial [2].
Urea is a chemical fertilizer used in agriculture to increase the nitrogen available to crops; it is not a food preservative. However, in Vietnam, urea is mixed with ice to preserve seafood: the dissolution of urea in water is endothermic, so the water cools and the fish meat stays fresh for longer. The misuse of urea in preservation arises from several factors: prolonged sea trips, insufficient ice supplies, the low cost and easy availability of urea, and a lack of knowledge and awareness of food safety, together with poor attitudes towards it, among those involved in the seafood supply chain [2]. These factors create the risk of urea contamination in seafood. Urea is not included in the list of food additives permitted for use in food under Circular No. 27/2012/TT-BYT [3], the Circular on food additive management published by the Ministry of Health of Vietnam in 2012, nor is it included in the Codex General Standard for Food Additives (GSFA, Codex STAN 192-1995) [4]. Regular consumption of food containing urea, even in low amounts, can lead to chronic poisoning with symptoms such as prolonged insomnia, headaches, body aches, memory loss, frequent cramps, loss of appetite leading to malnutrition, intestinal ulcers, and imbalances in calcium and phosphorus causing osteoporosis [5]. Eating food with high urea residue can result in acute poisoning with symptoms like abdominal pain, nausea, diarrhea, difficulty breathing, heart failure, and arteriosclerosis, which can lead to death [5].
Standard laboratory methods for analyzing urea are often time-consuming, sample-destructive, and expensive, and they require skilled personnel [6]. To promptly detect urea in seafood at markets, rapid analysis methods are needed that can screen samples at risk of urea contamination without sample destruction or complicated sample preparation. To the best of our knowledge, no research has been conducted on rapid and non-destructive analysis methods specifically for determining urea content in raw fish samples. We also find that most of the currently available rapid test kits for urea concentration are not specifically designed for fish samples. For example, the Urea/Ammonia Assay Kit (Rapid) produced by Neogen’s Megazyme is suitable for the rapid analysis of urea in water, beverages, milk, and food products [7]. This kit uses an enzymatic spectrophotometric method for accurate determination of urea concentrations [8]. However, it requires a spectrophotometer and two supplied enzymes, with a reaction time of approximately 10 min [7,8]. In Vietnam, a rapid detection kit for urea in frozen and fresh fish samples has been developed and commercialized by the Institute of Science and Technology, Ministry of Public Security of Vietnam. This kit, named UT12, has a limit of detection (LOD) of 1000 ppm [9]. While this test kit is cost-effective and capable of non-destructive urea detection in fish samples, its sample preparation remains somewhat cumbersome, involving the application of water droplets to the fish body and gills. Additionally, the time required to obtain results is still relatively long, ranging from 1 to 5 min [9]. Existing rapid methods for urea detection in fish therefore have limitations, including the need for sample preparation and prolonged detection times, and novel approaches are warranted to address them.
Recently, near-infrared (NIR) spectroscopy coupled with machine learning (ML) techniques has gained traction as an analytical approach for determining chemical composition and/or classifying food quality [10,11]. The NIR region, spanning wavelengths from 780 to 2500 nm, offers deep penetration into chemical substances [10,11]. Recent research [12] has demonstrated the efficacy of combining NIR spectroscopy with ML for the rapid and non-invasive evaluation of fish quality, encompassing both quantitative and qualitative aspects. This approach has been successfully employed to predict various fish attributes, including freshness [13,14,15,16,17], fat content [18,19,20], species identification [21,22], and geographical origin [23,24,25,26]. The application of NIR spectroscopy coupled with ML extends beyond the fishing industry and has proven valuable across diverse areas of food quality control. Notable examples include the detection of meat [27] and milk [28] adulteration and the assessment of fruit quality [29]. The recent development of low-cost handheld NIR spectrometers, together with ML, is expected to open up the possibility of rapid, on-site analysis of urea content based on NIR spectral measurements.
To address the limitations of existing rapid methods and support risk-based inspection, this study explores the potential of NIR spectroscopy coupled with ML for the rapid and non-destructive classification of raw fish samples based on urea content. Specifically, we aim to develop a novel method based on NIR spectroscopy and ML techniques that detects urea in fish meat with a LOD of 1000 ppm, but without sample preparation and with a reduced detection time. This approach could streamline safety assessments within the seafood industry, enabling efficient product management and reducing losses. Moreover, regulatory bodies could use this technology to enhance their inspection protocols. Building such a detector requires an ML pipeline capable of accurately classifying the urea content of a fish sample as either Safe (below 1000 ppm) or Unsafe based on its NIR spectrum. To achieve this, a NIR spectrum dataset of multiple fish samples with various urea contents covering the two safety classes needs to be collected. Subsequently, various combinations of feature extraction techniques and candidate ML models should be evaluated on the NIR dataset to identify the combination yielding the highest performance. Regarding the NIR spectral measurement itself, an important question concerns the relative importance of distinct anatomical regions of a fish for classifying its urea content. Because no previous studies exist in this field, several hypotheses are possible. This study therefore also identifies the optimal location on a fish’s body at which to measure the NIR spectrum for accurate identification of its safety class. The optimal measurement location could be either a specific position or any position. In the latter case, the identification of the safety class based on urea content is considered position independent.
2. Materials and Methods
Figure 1 presents our study’s proposed workflow. It begins with the collection of NIR spectra, followed by the handling of missing data to ensure a complete dataset. Next, the data are divided into training, validation, and test sets to support the development of a robust predictive model. The workflow incorporates SMOTE (Synthetic Minority Over-Sampling Technique) [30], which generates additional training data to address class imbalance. Afterward, the data undergo normalization and smoothing to ensure consistency and reduce noise. Feature extraction is then performed to identify and prioritize relevant information, enhancing model performance. These extracted features are used to train and validate the machine learning model, allowing fine-tuning of its parameters. Finally, the model’s performance is evaluated on an independent test set to ensure its ability to generalize to unseen data.
In the following subsections, we will describe each step in the workflow in detail.
2.1. Device and Software for Data Collection
We used a low-cost handheld NIR device, the DLP® NIRscan™ Nano Evaluation Module (Texas Instruments, Dallas, TX, USA) [31], to measure the NIR spectra of fish samples (Figure 2). This spectrometer leverages digital light processing (DLP) technology, which replaces the traditional linear array detector with a digital micromirror device (DMD) for wavelength selection and a single-point detector. By sequentially scanning through the columns of the DMD, a particular wavelength of light is directed to the detector and captured. This NIR scanner supports both Bluetooth Low Energy and USB communication, which enables measurements from mobile devices as well as laboratory computers. For NIR data collection, we employed the DLP NIRscan Nano GUI v2.1.0, the software supplied with the device, to start the scanning process and transfer recorded spectra from the spectrometer to our Windows computer through a USB cable. We did not apply any calibration method. Regarding scanning methods, the Column method selects one wavelength at a time, while the Hadamard method multiplexes several wavelengths at a time and then decodes the individual wavelengths. We chose the Hadamard method because it collects much more light and offers a greater signal-to-noise ratio than the Column method.
In the scanning process, part of the NIR radiation emitted from the device was absorbed by the dissected samples. The remainder was either reflected back to the device sensor or transmitted through the sample. We could therefore obtain absorbance, reflectance, and transmittance spectra simultaneously. Each spectrum consists of 228 wavelength points in the range of 900–1700 nm, i.e., a sampling interval of approximately 3.5 nm between adjacent points. Among the three types of spectra, we decided to use the absorbance spectrum to conduct experiments for the classification of urea content in fish.
2.2. Sample Collection
We collected 299 fish specimens, including 113 mackerel, 98 tuna, and 88 pompano samples. These specimens were sourced directly from offshore fishing vessels in Central Vietnam. The fish samples were immersed in ice flakes with the addition of three different concentrations of urea: 1%, 2%, and 3%.
2.3. NIR Measurement for Fish Samples
During an eight-hour period, a fish sample was removed from the soaking tank every two hours and allowed to equilibrate to room temperature, ranging from 25 °C to 35 °C. Following this, the samples were cleaned and dried using blotting paper. NIR spectra were recorded at four external locations on the fish’s skin: the nape, back, stomach, and tail. Subsequently, the fish was filleted, and NIR measurements were taken at four internal locations: again, the nape, back, stomach, and tail. These eight positions were selected to encompass the entire body of the fish and ensure stable measurements, avoiding the risk of fluctuating spectra from uneven surfaces, such as the gills, or highly humid regions, such as the eyes. Each of the eight positions was measured five times, resulting in a total of 40 distinct NIR spectrum samples per fish.
In this study, we simulated a scenario in which fishermen improperly use urea to preserve fish by soaking the samples in ice flakes supplemented with urea. To enhance the robustness of our findings, NIR spectra were measured under room temperature conditions without controlling ambient humidity. The dataset utilized in this study comprises 11,960 samples of NIR absorption spectra collected from 299 fish specimens.
2.4. NIR Dataset Labeling and Division
After NIR measurement, the filleted fish was ground using a blender. Then, its urea content was determined using a high-pressure liquid chromatography (HPLC) method described in [6]. The principle of this method is that urea in the fish sample is derivatized with xanthydrol (9H-xanthen-9-ol) to form N-9H-xanthen-9-ylurea. This derivative is then analyzed using an HPLC system with a fluorescence detector, with an excitation wavelength of 213 nm and an emission wavelength of 308 nm. The forty NIR spectra of each fish were then assigned a safety label according to the urea content of that fish. If the urea content of the fish fell below the established threshold of 1000 ppm, the NIR spectra associated with that fish were classified as “Safe”; otherwise, the NIR spectra were classified as “Unsafe”.
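The labeling rule can be illustrated with a minimal sketch. The function names and the example urea values are hypothetical, and treating a value of exactly 1000 ppm as “Unsafe” is an assumption based on the definition of “Safe” as below 1000 ppm.

```python
UREA_THRESHOLD_PPM = 1000.0  # threshold stated in the text
SPECTRA_PER_FISH = 40        # 8 positions x 5 repeated scans

def label_fish(urea_ppm):
    """Return 'Safe' below the threshold, otherwise 'Unsafe' (assumed tie rule)."""
    return "Safe" if urea_ppm < UREA_THRESHOLD_PPM else "Unsafe"

def label_spectra(urea_ppm_per_fish):
    """Copy each fish-level label to all spectra measured on that fish."""
    labels = []
    for urea_ppm in urea_ppm_per_fish:
        labels.extend([label_fish(urea_ppm)] * SPECTRA_PER_FISH)
    return labels

# Hypothetical HPLC results (ppm) for three fish.
labels = label_spectra([120.0, 1000.0, 2500.0])
```

Every spectrum inherits the label of the fish it was measured on, so the label depends on the HPLC result only and not on the measurement position.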
Finally, the whole NIR spectrum dataset was divided into training, validation, and test sets at a ratio of 3:1:1 for training, validating, and evaluating the classification models, respectively. The division satisfied two criteria: a fish sample and all its associated spectra belonged to only one subset, and the urea content distributions of the training, validation, and test sets were similar. These requirements ensured the objectivity of the model building and evaluation processes.
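One possible way to satisfy both criteria is sketched below; it is not necessarily the procedure used in this study. Fish are sorted by urea content and each consecutive block of five is dealt randomly into three training slots, one validation slot, and one test slot. The function name `split_fish` and the simulated urea values are assumptions for illustration.

```python
import numpy as np

def split_fish(urea_ppm, seed=0):
    """Assign each fish to 'train', 'val' or 'test' at roughly 3:1:1."""
    rng = np.random.default_rng(seed)
    urea_ppm = np.asarray(urea_ppm, dtype=float)
    order = np.argsort(urea_ppm)  # fish indices sorted by urea content
    pattern = np.array(["train", "train", "train", "val", "test"])
    subset = np.empty(len(urea_ppm), dtype=object)
    for start in range(0, len(order), 5):
        block = order[start:start + 5]
        # Shuffle the pattern so the assignment is random within each urea stratum.
        subset[block] = rng.permutation(pattern)[:len(block)]
    return subset

# Hypothetical urea contents (ppm) for 299 fish.
urea = np.random.default_rng(1).uniform(0, 3000, size=299)
subset = split_fish(urea)

# Spectra inherit the subset of their fish (40 spectra per fish),
# so no fish contributes spectra to more than one subset.
spectrum_subset = np.repeat(subset, 40)
```

Because the assignment is made per fish and only afterwards expanded to spectra, spectra from the same fish cannot leak between the training, validation, and test sets.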
2.5. Data Pre-Processing
In the pre-processing stage, we applied three techniques in sequence: missing data handling, data normalization, and data smoothing. If the absorbance value at a wavelength of a spectrum was missing, it was replaced by the average of the absorbance values at the two neighboring wavelengths. Then, standard normal variate correction (i.e., z-score normalization) was applied to every spectrum of the dataset individually to eliminate the deviations caused by particle size and scattering, making the NIR data consistent. Finally, the NIR spectra were passed through a Savitzky–Golay (SG) filter with a window length of 13 points and a polynomial order of five to smooth the spectra, thereby removing part of the noise [32]. The pre-processing steps were conducted using the SciPy v1.14.1 library.
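The three steps above can be sketched with NumPy and SciPy as follows. This is a minimal illustration under stated assumptions: missing values are encoded as NaN and occur only as isolated interior gaps, the population standard deviation is used for the normalization, and the helper names and the synthetic spectrum are hypothetical.

```python
import numpy as np
from scipy.signal import savgol_filter

def fill_missing(spectrum):
    """Replace each NaN by the mean of its two neighbouring absorbance values."""
    s = np.array(spectrum, dtype=float)
    for i in np.flatnonzero(np.isnan(s)):
        # Assumes isolated interior gaps, i.e. both neighbours are present.
        s[i] = 0.5 * (s[i - 1] + s[i + 1])
    return s

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    # Population standard deviation (ddof=0) is an assumption of this sketch.
    return (spectrum - spectrum.mean()) / spectrum.std()

def preprocess(spectrum):
    """Fill gaps, normalize, then smooth with the SG settings given in the text."""
    s = snv(fill_missing(spectrum))
    return savgol_filter(s, window_length=13, polyorder=5)

# Synthetic 228-point spectrum with one missing value, for illustration only.
raw = np.linspace(0.2, 0.8, 228) + 0.05 * np.sin(np.linspace(0, 20, 228))
raw[100] = np.nan
prep = preprocess(raw)
```

Each spectrum is processed on its own, so the same function can be applied unchanged to training, validation, test, and synthetic spectra.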
A notable characteristic of our dataset is the severe imbalance between the two safety classes. The number of NIR samples belonging to the “Safe” class is nearly six times that of the “Unsafe” class. This can bias a classification model towards the majority “Safe” class. To solve this problem, we used the SMOTE technique, which generates new data points for the minority “Unsafe” class. It analyzes existing minority data points and generates new ones similar to them. By adding these synthetic samples, SMOTE balances the data, improving the model’s ability to learn the minority class. After applying SMOTE to the training subset, the number of NIR samples belonging to the “Unsafe” class was equal to that of the “Safe” class. The details of the SMOTE algorithm can be found in [30]. We used the Python package imbalanced-learn v0.11.0 to perform SMOTE. The synthetic spectrum samples were also normalized and smoothed in the same way as the original ones.
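The core idea of SMOTE, interpolating between a minority sample and one of its nearest minority neighbours, can be sketched in plain NumPy. This is a simplified illustration and not the imbalanced-learn implementation used in the study; the function name `smote_like`, the choice of k = 5 neighbours, and the random example data are assumptions.

```python
import numpy as np

def smote_like(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    # Pairwise Euclidean distances within the minority class.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                 # a point is not its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]   # k nearest minority neighbours
    base = rng.integers(0, len(X_min), size=n_new)
    pick = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))                # interpolation weight in [0, 1)
    return X_min[base] + gap * (X_min[pick] - X_min[base])

# Hypothetical minority-class ("Unsafe") spectra: 20 samples of 228 points.
rng = np.random.default_rng(42)
X_unsafe = rng.normal(size=(20, 228))
synthetic = smote_like(X_unsafe, n_new=100)
```

Each synthetic spectrum lies on the line segment between two real minority spectra, which is why oversampling should be restricted to the training subset, as described above.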
2.6. Feature Extraction
Relevant features need to be chosen for building classification models. For a fish sample, its pre-processed NIR spectrum is a natural choice of feature vector for safety classification. We further examined the derivatives of the pre-processed spectrum to see whether they help to differentiate the safety labels. We investigated six types of feature vectors based on the concatenation of the pre-processed spectrum and its derivatives, as described in Table 1.
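As an illustration of one such feature type, the sketch below builds a vector from a pre-processed spectrum concatenated with its first derivative. The finite-difference derivative via `np.gradient` is an assumption (the derivative could equally be obtained from a Savitzky–Golay filter), and the stand-in spectrum and function name are hypothetical.

```python
import numpy as np

def prep_plus_der1(prep):
    """Concatenate a pre-processed spectrum with its first derivative."""
    # Finite-difference derivative with respect to the wavelength index.
    der1 = np.gradient(prep)
    return np.concatenate([prep, der1])

# Stand-in for one pre-processed 228-point spectrum.
prep = np.sin(np.linspace(0, 3, 228))
features = prep_plus_der1(prep)
```

With 228 points per spectrum, this concatenation yields a feature vector of length 456.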
2.7. Model Training and Validation
We used both traditional ML and modern deep learning (DL) approaches to build classification models and compared their performance in classifying a fish sample as safe or unsafe based on its urea content, as reflected by its extracted NIR spectral features. For the traditional ML approach, we experimented with four algorithms: decision tree (DT) [33], k-nearest neighbors (KNN) [34], support vector machine (SVM) [35], and extreme gradient boosting (XGB) [36]. For the DL approach, we employed a convolutional neural network (CNN) [37] and proposed suitable architectures depending on the experiments. Model training and hyperparameter tuning were conducted using the scikit-learn v1.4.0 toolkit for the conventional ML algorithms and the Keras v2.10.0 framework for the CNN models. After the optimal models were determined, their performance was evaluated on the common test set.
2.8. Improved Detection Setup
Because each combination of input feature type and measurement position (and thus the corresponding NIR sub-dataset) leads to a different configuration of the CNN model, for conciseness we present only the construction and evaluation of the CNN model that achieved the highest classification accuracy. Figure 3 describes the proposed CNN architecture in this case. The input layer takes a feature vector of size 456 × 1 of the type “prep + der1” (i.e., the pre-processed spectrum concatenated with its first derivative). The model has two convolutional layers with kernels of size 8 × 1 and Rectified Linear Unit (ReLU) activation functions, each followed by a max-pooling layer with a pool size of 2 × 1. The output of the final max-pooling layer is passed through a flatten layer, which converts the multi-dimensional data into one-dimensional data, and is then fed into two fully connected (i.e., dense) layers. A dropout layer is placed before the two dense layers. The first dense layer consists of eight neurons with a ReLU activation function. The last dense layer contains two neurons with a softmax activation to predict the output (i.e., the safety label) of the model. In total, the proposed CNN model has 7482 parameters.
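To make the layer arithmetic concrete, the following sketch traces the tensor sizes and counts the trainable parameters for an architecture of this shape. The numbers of convolutional filters and the padding mode are not stated in the text, so the values used here (8 filters per layer, no padding) are purely hypothetical and the resulting total is not expected to match the 7482 parameters reported above.

```python
def conv1d(length, channels, filters, kernel=8):
    """Output length, channels and parameter count of an unpadded 1D convolution."""
    params = kernel * channels * filters + filters  # weights + biases
    return length - kernel + 1, filters, params

def max_pool(length, pool=2):
    """Output length of a non-overlapping max-pooling layer."""
    return length // pool

def dense(inputs, units):
    """Output size and parameter count of a fully connected layer."""
    return units, inputs * units + units

def count_parameters(f1, f2, input_len=456):
    """Trace conv -> pool -> conv -> pool -> flatten -> dense(8) -> dense(2)."""
    length, ch, p1 = conv1d(input_len, 1, f1)
    length = max_pool(length)
    length, ch, p2 = conv1d(length, ch, f2)
    length = max_pool(length)
    flat = length * ch  # flatten; dropout adds no parameters
    units, p3 = dense(flat, 8)
    _, p4 = dense(units, 2)
    return flat, p1 + p2 + p3 + p4

# Hypothetical filter counts; the actual values are not given in the text.
flat, total = count_parameters(f1=8, f2=8)
```

In this hypothetical configuration, most of the parameters sit in the first dense layer, which illustrates why the size of the flattened output dominates the parameter budget of such a small CNN.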
The model was trained with the Keras framework using the Adam optimizer with an initial learning rate of 0.001. The learning rate was reduced by a factor of 0.8 whenever training stopped progressing. The validation set was used to decide when to stop training. Given the substantial parameter count of the initial CNN architecture and the limited number of training samples, overfitting was a significant concern, so a dropout layer was incorporated to mitigate it. Figure 4 shows how the cross-entropy loss of the CNN model varied on the training and validation sets over the training epochs. We stopped training after 50 epochs to prevent overfitting, since the model’s loss on the validation set had converged by this point.
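The learning-rate rule can be illustrated with a small framework-free sketch. It assumes that “reduced by a factor of 0.8” means multiplying the current rate by 0.8, that progress is monitored on the validation loss, and that a patience of three epochs is used; the patience, the monitored quantity, and the example loss values are hypothetical. In Keras, comparable behaviour is available through the `ReduceLROnPlateau` callback.

```python
def reduce_on_plateau(val_losses, lr=0.001, factor=0.8, patience=3):
    """Return the learning rate used at each epoch under a plateau rule."""
    best, wait, history = float("inf"), 0, []
    for loss in val_losses:
        history.append(lr)
        if loss < best:
            best, wait = loss, 0      # progress: remember the best loss
        else:
            wait += 1                 # no progress in this epoch
            if wait >= patience:
                lr, wait = lr * factor, 0
    return history

# Hypothetical validation losses over seven epochs.
history = reduce_on_plateau([1.0, 0.8, 0.7, 0.7, 0.7, 0.7, 0.6])
```

In this example, the loss fails to improve for three consecutive epochs, so the rate drops from 0.001 to 0.0008 for the final epoch.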
5. Conclusions
The results of this study provide important insights into the application of NIR spectroscopy combined with ML techniques for classifying fish safety based on urea content. By comparing position-dependent and position-independent measurements, as well as different feature extraction methods and classifiers, the study has highlighted the critical factors influencing model performance and the potential for NIR-based approaches in food safety assessment.
5.1. Importance of Measurement Location
One of the most significant findings was the superior performance of position-dependent models, particularly those utilizing NIR data collected from the skin at the stomach region of the fish. Across all classifiers, this specific location consistently yielded higher accuracy, suggesting that the stomach area may provide more relevant spectral information for detecting urea content, possibly due to the concentration or distribution of chemical compounds in that region. This insight is crucial as it indicates that, for practical applications of NIR spectroscopy in fish safety, careful consideration must be given to the anatomical site of measurement.
5.2. Performance of Machine Learning Models
The study demonstrated that traditional ML models, such as Decision Trees (DT), K-Nearest Neighbors (KNN), Support Vector Machines (SVM), and Extreme Gradient Boosting (XGB), could effectively classify fish samples with varying degrees of accuracy depending on the feature type and measurement position. However, the Convolutional Neural Network (CNN) model, particularly when combined with the “prep + der1” feature extraction technique, outperformed these traditional methods. The CNN model’s ability to achieve an accuracy of 83.9% underscores the potential of DL techniques in handling complex spectral data, where subtle patterns in the NIR spectra can be leveraged for more accurate classification.
5.3. Role of Feature Extraction
The effectiveness of different feature extraction methods varied across models, with combined features (e.g., pre-processed spectrum with its first derivative) generally leading to better performance. This suggests that incorporating multiple levels of spectral information helps in capturing more discriminative features, which is particularly beneficial for complex models like CNNs. The superior performance of the “prep + der1” combination in the CNN model highlights the importance of selecting appropriate feature vectors that align with the model’s architecture and learning capacity.
5.4. Implications for Practical Application
The findings from this study have practical implications for the development of NIR-based fish safety testing tools. The high accuracy achieved by the CNN model, particularly at the stomach region, suggests that such models could be integrated into rapid, non-destructive testing devices. These devices could offer food safety authorities and industry stakeholders a cost-effective alternative to traditional laboratory methods, allowing for on-site testing and quicker decision-making regarding fish safety.
However, the study also has limitations, such as the relatively small dataset and the use of a low-cost NIR scanner, which may affect the generalizability of the results. The potential for overfitting in CNN models, despite efforts to mitigate it through techniques like dropout and SMOTE, is another consideration that warrants further exploration.
5.5. Future Research Directions
To build on these findings, future research should focus on expanding the dataset in terms of both sample size and species diversity and exploring the use of higher-quality NIR equipment. Additionally, investigating other safety-related compounds in fish, such as histamine or borax, would help validate the broader applicability of NIR spectroscopy combined with ML/DL methods. Further research could also explore the integration of chemometrics and multivariate data analysis to enhance feature extraction and model performance, potentially enabling more sophisticated classification systems capable of distinguishing among multiple safety classes.
To summarize, this study has laid a strong foundation for the use of NIR spectroscopy and machine learning in fish safety assessment, with significant implications for both research and practical applications in food safety.