Article

Structural Analysis and Classification of Low-Molecular-Weight Hyaluronic Acid by Near-Infrared Spectroscopy: A Comparison between Traditional Machine Learning and Deep Learning

1 NMPA Key Laboratory for Technology Research and Evaluation of Drug Products, School of Pharmaceutical Sciences, Shandong University, Jinan 250012, China
2 National Glycoengineering Research Center, Shandong University, Jinan 250012, China
3 Key Laboratory of Chemical Biology (Ministry of Education), Shandong University, Jinan 250012, China
4 Bloomage Biotechnology Corporation Limited, Jinan 250012, China
* Author to whom correspondence should be addressed.
Molecules 2023, 28(2), 809; https://doi.org/10.3390/molecules28020809
Submission received: 15 December 2022 / Revised: 6 January 2023 / Accepted: 10 January 2023 / Published: 13 January 2023

Abstract

Confusing low-molecular-weight hyaluronic acid (LMWHA) obtained by acid degradation with that obtained by enzymatic hydrolysis (named LMWHA–A and LMWHA–E, respectively) can lead to health hazards and commercial risks. The purpose of this work was to analyze the structural differences between LMWHA–A and LMWHA–E and then achieve fast and accurate classification based on near-infrared (NIR) spectroscopy and machine learning. First, we combined nuclear magnetic resonance (NMR), Fourier transform infrared (FTIR) spectroscopy, two-dimensional correlated NIR spectroscopy (2DCOS), and aquaphotomics to analyze the structural differences between LMWHA–A and LMWHA–E. Second, we compared dimensionality reduction methods including principal component analysis (PCA), kernel PCA (KPCA), and t-distributed stochastic neighbor embedding (t-SNE). Finally, we compared the classification performance of traditional machine learning methods, including partial least squares–discriminant analysis (PLS-DA), support vector classification (SVC), and random forest (RF), with that of deep learning methods, including a one-dimensional convolutional neural network (1D-CNN) and long short-term memory (LSTM). The results showed that genetic algorithm (GA)–SVC and RF were the best performers among the traditional machine learning methods, but their highest accuracy on the test dataset was 90%, whereas the 1D-CNN and LSTM models achieved 100% accuracy on both the training and test datasets. These results show that, compared with traditional machine learning, the deep learning models were better suited to classifying LMWHA–A and LMWHA–E. Our research provides a new methodological reference for the rapid and accurate classification of biological macromolecules.

1. Introduction

Hyaluronic acid (HA) is a glycosaminoglycan composed of repeating disaccharide units (D-glucuronic acid and N-acetylglucosamine) [1]. Due to its unique molecular structure and physicochemical properties, it has physiological functions such as lubrication, moisturizing, and viscoelasticity, which make it widely used in biomedical and clinical fields [2,3,4,5]. In 2021, HA was approved by the National Health Commission of the People’s Republic of China for use in general food in the Chinese market. HA with a molecular weight above 10⁶ Da is called high-molecular-weight HA (HMWHA), and HA with a molecular weight below 10⁶ Da is called low-molecular-weight HA (LMWHA) [6]. Compared with HMWHA, LMWHA has higher permeability and higher biological activity, such as promoting wound healing, inhibiting tumor proliferation, and regulating immunity [7,8,9,10], giving it broad application prospects in medicine, food, and healthcare. In industry, LMWHA is usually obtained by acid degradation or enzymatic hydrolysis [11,12]. Acid degradation is fast and inexpensive, but it risks destroying the basic disaccharide structural unit of HA, and acid residues remain in the product, posing a threat to human health [12]. Enzymatic hydrolysis is mild and cleaves at specific sites, so it does not destroy the basic structure of HA [13]. Meanwhile, hyaluronidase, as a natural ingredient, is harmless to the human body [14], but its cost is high. To save production costs, some illegal producers may substitute acid-degraded LMWHA (LMWHA–A) for enzyme-hydrolyzed LMWHA (LMWHA–E), which puts the health of LMWHA users at constant risk. Especially when LMWHA–A is used in clinical medicine, the hidden dangers and losses it causes are immeasurable.
For LMWHAs with the same molecular weight but different degradation methods, it is difficult to distinguish them by simple means, whether as solid powders or in aqueous solution. Previously, the most effective approach was to analyze differences in chemical structure by nuclear magnetic resonance (NMR); however, interpreting NMR spectra is not easy, measurements are time-consuming (which is not conducive to continuous monitoring), and testing is relatively expensive. Therefore, it is necessary to establish a fast, accurate, and efficient method to distinguish LMWHAs.
The near-infrared (NIR) spectral region lies between 780 nm and 2526 nm, and NIR spectroscopy captures molecular vibration information of hydrogen-containing groups, so it is widely used in bioscience and related fields [15,16]. Because NIR spectral bands overlap severely, the spectra must be further analyzed with knowledge and techniques from the field of chemometrics [17,18]. In recent years, the rapid development of machine learning, especially deep learning, has greatly enriched chemometrics and proven useful for solving complex problems [19].
Classification algorithms are an important branch of machine learning. Traditional machine learning classification methods include partial least squares–discriminant analysis (PLS-DA) [20], decision trees (DTs) [21], random forest (RF) [22], Naive Bayes [23], the k-nearest neighbor algorithm (KNN) [24], and support vector machines (SVMs) [25]. Traditional machine learning models are, in essence, shallow learners: they have lower model complexity, more interpretable features, lower demands on computer performance, and faster training [26]. However, their ability to represent complex functions is limited when samples and computing units are limited [27], so their generalization ability is restricted for complex classification problems. At present, it is popular to optimize such models with evolutionary algorithms, represented by the genetic algorithm (GA), or swarm intelligence algorithms, represented by particle swarm optimization (PSO) [28,29]. As an emerging approach in machine learning, deep learning integrates feature learning and model building into one model by selecting different kernels and adjusting parameters through end-to-end optimization, and it is proving to handle complex, non-linearly separable data well [30,31]. Popular deep learning classification methods are mainly based on convolutional neural networks (CNNs) [32], recurrent neural networks (RNNs) [33], and deep neural networks (DNNs) [34].
In the present work, we first performed structural analysis of LMWHA–A and LMWHA–E by NMR and two-dimensional correlated NIR spectroscopy (2DCOS). Second, the differences in the types of water molecules in the two aqueous LMWHA solutions were analyzed using the theory of aquaphotomics, to indirectly corroborate the structural differences between the two. Third, we employed a series of linear and nonlinear dimensionality reduction methods, including principal component analysis (PCA), kernel PCA (KPCA), and t-distributed stochastic neighbor embedding (t-SNE), to observe the distribution of the dataset in 3D space. Fourth, we compared several traditional machine learning classification methods and applied several intelligent optimization algorithms to improve support vector classification (SVC). Finally, we established classification models using a one-dimensional CNN (1D-CNN) and a long short-term memory (LSTM) model in deep learning. Figure 1 shows a flow diagram of this study.

2. Results and Discussion

2.1. NMR and FTIR Spectrum Description

Figure S1 shows the NMR spectra of LMWHA–A and LMWHA–E. Comprehensive analysis of the 1D and 2D spectra shows that, for both LMWHA–A and LMWHA–E, the low-field region of the carbon spectrum (chemical shifts between 168 ppm and 174 ppm) contained carbon signals of carboxyl and amide groups. However, LMWHA–A had an additional peak between 172 ppm and 173 ppm, which represented the signal produced by hydrolysis of amide groups. Figure S2 shows the Fourier transform infrared (FTIR) spectra of LMWHA–A and LMWHA–E solutions. The yellow area in Figure S2 highlights the differences in band intensity, shape, and position between LMWHA–A and LMWHA–E. Among them, the difference in absorption from 1250 cm−1 to 1580 cm−1 can be attributed to changes in the amide group and the symmetric C-O stretching vibrations of the ether bond [35], while the difference in absorption between 1750 cm−1 and 2400 cm−1 can be attributed to C=O stretching and C-H bending of the amide group [36]. Therefore, it can be inferred that under acid degradation conditions, the C-N bond of the amide group was cleaved, while the carboxyl group in the primary structure of HA remained. Under enzymatic hydrolysis conditions, by contrast, neither the carboxyl group nor the amide group was cleaved. On the other hand, it can be seen from the terminal carbon signals of the NMR spectra that the chemical shift of the terminal carbon of a monosaccharide within the sugar chain was around 107 ppm, whereas the terminal carbon signal of a free monosaccharide generally appeared around 100 ppm [36,37]. Under acid degradation conditions, no signal was found around 107 ppm while signals were abundant around 100 ppm; in the LMWHA–E spectra, in contrast, a carbon signal appeared around 107 ppm, which was the signal of the terminal carbon of the unbroken disaccharide unit. From this, it can be inferred that under acid degradation conditions the ether bonds connecting the monosaccharides of HA are broken, while enzymatic hydrolysis does not break these ether bonds [38].
The results in this section suggest that enzymatic hydrolysis did not destroy the basic building blocks of HA, whereas acid degradation did the opposite. Figure S3 shows the deduced chemical structures of LMWHA–A and LMWHA–E. Importantly, human hyaluronidase is only capable of specific degradation of structurally intact HA [39]. Therefore, residues of LMWHA–A are at risk of accumulation in humans.

2.2. NIR Spectrum Description

The raw NIR spectra of the LMWHA solution samples in the 780 nm–2500 nm wavelength region are shown in Figure 2a. After preprocessing with the Savitzky–Golay (SG) smoothing filter and the multiplicative scatter correction (MSC) method, the noise was suppressed, and the spectra appeared smoother than the raw spectra (as shown in Figure 2b). The whole spectrum showed the remarkable features of a water system: there were four bands around 970 nm, 1190 nm, 1450 nm, and 1940 nm, which reflected the second overtone of the OH stretching band, a combination of the first overtone of the OH stretching and the OH bending band, the first overtone of the OH stretching band, and a combination of the OH stretching and OH bending bands, respectively [40,41].

2.3. Analysis of 2DCOS Synchronous and Asynchronous Spectra

As an auxiliary analytical tool for one-dimensional spectroscopy, 2DCOS can help identify the chemical information of overlapping peaks and small peaks. In a HA solution, hydrogen bonds are formed between water molecules, between HA, and between HA and water molecules [42]. In order to facilitate the analysis of the differences between LMWHA–A and LMWHA–E in an aqueous solution, the first overtone (1300 nm–1600 nm) of O–H and hydrogen bonds of water molecules was taken as the signal region of interest, and 2DCOS synchronous and asynchronous spectra were obtained according to the Noda algorithm [43].
By comparing Figure 3a1 and Figure 3a2, it can be found that the synchronous cross-peaks of both LMWHA–A and LMWHA–E were positive at 1300 nm–1600 nm, and the peak intensity increased with wavelength. At 1500 nm–1600 nm (the first overtone stretching vibration of the hydrogen-bonded hydroxyl group), the intensity of the cross-peaks was significantly higher than in the remaining regions, which reflect the first overtone stretching vibration of the free hydroxyl group. This indicated that both acid degradation and enzymatic hydrolysis strongly disturbed the hydrogen bonds in aqueous solution.
By comparing Figure 3b1 and Figure 3b2, it can be found that LMWHA–A had four auto-peaks of similar intensity in the red region of 1525 nm–1600 nm, while LMWHA–E had only one auto-peak of maximum intensity in this region. This further indicated that acid degradation and enzymatic hydrolysis induced different changes in the HA structure.
By comparing Figure 3c1 and Figure 3c2, it can further be found that the signs of the cross-peaks in the asynchronous spectrum of LMWHA–A were almost the same in the range of 1550 nm to 1600 nm, while there were obvious differences for LMWHA–E. According to Noda’s theory, in the synchronous and asynchronous 2DCOS plots, the signs of the cross-peaks located at (λ1, λ2) can be used to reveal the order of spectral intensity change at λ1 and λ2 [44]. If the synchronous and asynchronous cross-peaks have the same sign, the intensity change at λ1 occurs before that at λ2; if they have opposite signs, the band intensity change at λ2 occurs before that at λ1. Therefore, it was not difficult to determine that the disturbance of enzymatic hydrolysis to the first overtone stretching vibration of the hydrogen-bonded hydroxyl group occurred before its disturbance to the first overtone stretching vibration of the free hydroxyl group. Although acid degradation showed a similar tendency, the effect was not as obvious as that of enzymatic hydrolysis.
Therefore, it can be inferred that enzymatic hydrolysis drives the HA solution to evolve in a simpler direction, with a clear influence on the hydrogen bond arrangement in the solution. Acid degradation, by contrast, acts less selectively and imposes no strong directionality on the hydrogen bond arrangement in aqueous solution. This is consistent with the enzymatic method retaining the primary structure of HA; that is, more hydrogen bonds tended to form in solution because more structurally complete short-chain polysaccharides were produced. This result confirmed the inference described in Section 2.1.

2.4. Aquaphotomics Analysis

The water spectrum contains information about covalently bonded hydroxyl groups and hydrogen bonds and is strongly influenced by other molecules and environmental factors in solution. It can be found from Figure 4 that LMWHA–A had a dominant absorption at 1346 nm–1375 nm, while LMWHA–E had a dominant absorption at 1480 nm–1513 nm. Table S1 lists the water matrix coordinates (WAMACs) and the vibrational information of the molecular structures they represent [45]. From the analysis of Table S1, there were more H2O asymmetric stretching vibrations and water solvation shells in the LMWHA–A solution, while there were more water molecules with three or four hydrogen bonds, H2O bending vibrations, and strongly bound water in the LMWHA–E solution. This showed that LMWHA–E can more strongly associate water molecules through hydrogen bonds and promote the formation of more hydrogen bonds in aqueous solution. This finding supported the conclusion in Section 2.3.

2.5. Sample Exploration by PCA, KPCA, and t-SNE

PCA is a commonly used data analysis method. It transforms the original data into a set of linearly independent representations of each dimension through linear transformation [46]. It can be used to extract the main feature components of the data and is often used for the dimensionality reduction of high-dimensional data.
Figure 5a shows the distribution of scores of LMWHA–A and LMWHA–E in the 2D space composed of the first two principal components (PC1 and PC2). It can be found that the two types of samples were not clearly separated in this spatial distribution. Figure 5b shows the correlation loadings of PC1 and PC2. It is not difficult to find that, in the wavelength range covered by the green area, the correlation loadings of PC1 and PC2 changed in different directions (they also differed at the long-wavelength end of the spectrum, but as shown in Figure 2, the absorbance values at those wavelengths were too high, so these variables were disregarded). Notably, the green area overlaps strongly with the first overtone of the O–H and hydrogen bonds of water molecules, which suggested that changes in water molecules and hydrogen bonds were important internal factors distinguishing LMWHA–A from LMWHA–E.
A PCA can identify underlying dominant features and provide a more concise and straightforward summary of relevant covariates, but it can only be applied to linearly separable datasets. If we apply a PCA to a non-linear dataset, we may obtain a poor dimensionality reduction result. LMWHA–A and LMWHA–E have a high similarity in structure, so it is necessary to try nonlinear dimensionality reduction methods. KPCA uses a kernel function to map the dataset to a high-dimensional feature space (a reproducing kernel Hilbert space), and then performs PCA in this high-dimensional space to achieve nonlinear dimensionality reduction of the data [47,48].
As shown in Figure 6, multiple kinds of kernel functions (Gaussian, polynomial, sigmoid, and Laplacian) were used for dimensionality reduction and visualization. From the 2D score plots, no matter which kernel function was used, LMWHA–A and LMWHA–E were not effectively distinguished. Although PCA and KPCA are mainly used for dimension reduction rather than cluster analysis or visualization, the results of these two methods at least illustrate one fact: LMWHA–A and LMWHA–E share many similarities in structure, resulting in high similarity in many features of their NIR spectra.
t-SNE is another popular method for nonlinear dimensionality reduction, which tries to keep similar instances adjacent and dissimilar instances apart while reducing dimensionality [49]. One of its main advantages is that the original features of the dataset are preserved as far as possible in the mapping from high-dimensional to low-dimensional space; that is, two data points that are similar in high-dimensional space remain similar when mapped to low-dimensional space [26]. t-SNE is widely used for visualization in fields such as bioinformatics, biomedical signal processing, and natural language processing.
Figure 7a,b shows the visualization results of the t-SNE algorithm reducing the data to 3D space and a 2D plane, respectively. It can be seen that LMWHA–A and LMWHA–E were well separated in 3D space. Although a small number of samples were still mixed together, the overall visualization is better than that of PCA and KPCA. Considering the principle of t-SNE, we believe this is because the algorithm introduces the t-distribution, a long-tailed distribution that better tolerates the influence of outliers on the bulk of the samples, thereby making better use of the overall characteristics of the data and improving the robustness of the algorithm.
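Although all computations in this work were performed in MATLAB (Section 3.10), the comparison above can be sketched in Python with scikit-learn. In the illustrative sketch below, the placeholder spectral matrix, the labels, and the t-SNE perplexity are assumptions rather than the settings actually used.

```python
# Illustrative sketch (not the authors' MATLAB code): comparing PCA, KPCA with
# four kernels, and t-SNE on a preprocessed NIR matrix X of shape (90, 4148).
import numpy as np
from sklearn.decomposition import PCA, KernelPCA
from sklearn.manifold import TSNE
from sklearn.metrics.pairwise import laplacian_kernel

rng = np.random.default_rng(0)
X = rng.normal(size=(90, 4148))   # placeholder for the preprocessed spectra
y = np.repeat([0, 1], 45)         # hypothetical labels: 0 = LMWHA-A, 1 = LMWHA-E

# Linear PCA scores for the 2D score plot (cf. Figure 5a).
scores_pca = PCA(n_components=2).fit_transform(X)

# Kernel PCA with Gaussian (RBF), polynomial, and sigmoid kernels (cf. Figure 6).
scores_kpca = {k: KernelPCA(n_components=2, kernel=k).fit_transform(X)
               for k in ("rbf", "poly", "sigmoid")}
# The Laplacian kernel is supplied as a precomputed kernel matrix.
scores_kpca["laplacian"] = KernelPCA(
    n_components=2, kernel="precomputed").fit_transform(laplacian_kernel(X))

# t-SNE embeddings in 3D and 2D (cf. Figure 7a,b).
emb_3d = TSNE(n_components=3, perplexity=15, random_state=0).fit_transform(X)
emb_2d = TSNE(n_components=2, perplexity=15, random_state=0).fit_transform(X)
```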

2.6. Sample Classification Based on Traditional Machine Learning Methods

2.6.1. PLS-DA

PLS-DA is essentially a classification method based on latent variables. It decomposes the spectral matrix and the response variable orthogonally at the same time, establishes a regression relationship between them, and achieves a better classification effect than PCA in the projection map [20]. For the binary classification problem in this study, the response variables of the known categories were set to 0 (LMWHA–A) and 1 (LMWHA–E), and the predicted response values were then rounded and compared with the true labels to calculate the classification accuracy. As shown in Figure 8, after leave-one-out cross-validation (LOOCV), 4 of the 80 training set samples were misclassified (Figure 8a; misclassified samples are marked in red) and 2 of the 10 test set samples were misclassified (Figure 8b; misclassified samples are marked in red); the results were thus not perfect.
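A minimal Python sketch of this PLS-DA scheme is given below for illustration (the authors used MATLAB); the number of latent variables is an assumption.

```python
# Illustrative PLS-DA sketch: regress 0/1 labels on the spectra, round the
# predicted response to a class, and estimate accuracy by LOOCV.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import LeaveOneOut

def plsda_loocv_accuracy(X, y, n_components=10):   # n_components is assumed
    correct = 0
    for train_idx, test_idx in LeaveOneOut().split(X):
        pls = PLSRegression(n_components=n_components)
        pls.fit(X[train_idx], y[train_idx])
        # Round the continuous PLS prediction to the nearest class label.
        pred = int(np.clip(round(float(pls.predict(X[test_idx]).ravel()[0])), 0, 1))
        correct += int(pred == y[test_idx][0])
    return correct / len(y)
```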

2.6.2. SVC and Optimized SVCs

The basic idea of SVC is to establish a hyperplane as a decision surface based on the principle of structural risk minimization, maximizing the margin between samples of different categories [25]. SVC first uses the selected kernel function to nonlinearly map the training set from the input space to a high-dimensional feature space and then completes linear classification in this space. Therefore, different kernel functions lead to different classification effects. At present, the kernel function generally recognized as most effective for classification problems with small sample sizes is the radial basis function (RBF) [50]. However, the hyperparameters C and g of the RBF kernel affect the performance of the classifier [51]. C is the penalty coefficient, that is, the tolerance for errors: the larger C is, the less error is tolerated and the easier it is to overfit; the smaller C is, the easier it is to underfit. If C is too large or too small, generalization ability deteriorates. The parameter g implicitly determines the distribution of the data after it is mapped to the new feature space: the larger g is, the fewer the support vectors; the smaller g is, the more support vectors there are. The number of support vectors affects the speed of training and prediction. To improve classification accuracy and speed up the computation, several optimization algorithms have been proposed; among them, grid search (GS), GA, and PSO are currently the three most popular. GS optimizes the model by traversing given parameter combinations and determines the best C and g through cross-validation [52]. GA is a stochastic optimization search algorithm that mimics biological evolution (the genetic mechanism of survival of the fittest). It can process multiple individuals in a population at the same time, reducing the risk of falling into a locally optimal solution [53]. PSO is a stochastic optimization algorithm based on swarm intelligence. It imitates the foraging behavior of birds, treating the search space as the birds’ flight space and the optimal solution as the food the birds are seeking; through continuous iteration and calculation of fitness values, the optimal solution is finally obtained [54]. Therefore, the three intelligent algorithms GS, GA, and PSO were used to optimize the parameters of the SVC kernel and were compared with the traditional SVC. Figure 9 shows the parameter selection results of GS−SVC, GA−SVC, and PSO−SVC.
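Of the three optimizers, only grid search has a standard scikit-learn implementation; the illustrative sketch below tunes C and g (gamma) of an RBF-kernel SVC by cross-validated grid search, with the search ranges and the variable names X_train, y_train, X_test, and y_test assumed (corresponding to the split described in Section 3.4).

```python
# Illustrative GS-SVC sketch: cross-validated grid search over C and g (gamma)
# for an RBF-kernel support vector classifier.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {
    "C": [2.0**k for k in range(-5, 11, 2)],      # penalty coefficient C
    "gamma": [2.0**k for k in range(-15, 4, 2)],  # RBF width parameter g
}
gs = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="accuracy")
gs.fit(X_train, y_train)        # X_train, y_train: 80 training spectra + labels
print(gs.best_params_)          # the selected (C, g) pair, cf. Figure 9
print(gs.score(X_test, y_test)) # accuracy on the held-out 10 test samples
```

GA- and PSO-based tuning follow the same pattern but replace the exhaustive grid with an evolutionary or swarm search over (C, g).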
Figure S4 shows the confusion matrices of the training dataset for SVC, GS−SVC, GA−SVC, and PSO−SVC. It can be found that GS−SVC achieved 100% classification accuracy on the training dataset, the best among the three optimization methods. Figure S5 shows the receiver operating characteristic (ROC) curves and the area under the curve (AUC) of SVC and the three optimized SVCs on the training dataset. As an auxiliary evaluation index to the confusion matrix, the closer the ROC curve is to the upper left corner, the larger the AUC and the better the performance of the classifier. The AUC of GS−SVC was 1, indicating that it had the best classification effect on the training dataset.
Figure S6 shows the confusion matrices of the test dataset for SVC and the three optimized SVCs. It can be found that the accuracy of GA−SVC on the test dataset was 90%, and the accuracies of GS−SVC and PSO−SVC were 80%, all higher than that of the traditional SVC (70%). Figure S7 shows the ROC curves and AUCs of the test dataset for SVC and the three optimized SVCs. The AUC of GA−SVC was the largest, reaching 0.96, and the AUCs of GS−SVC and PSO−SVC were both 0.80, higher than that of the traditional SVC (0.64).
Table 1 shows all of the classification metrics of SVC, GS−SVC, GA−SVC, and the PSO−SVC model, including accuracy, precision, specificity, sensitivity (recall), F1 score, and AUC. Based on the analysis of various indicators, the GS−SVC algorithm performed the best in the classification of the training dataset, and the GA−SVC algorithm performed the best in the classification of the test dataset. Compared with the traditional SVC, the three optimization methods improved the classification effect.
Nu-SVC is an SVC with a polynomial kernel, and its default degree is three. The parameter nu represents the upper limit of the training error rate, or the lower limit of the fraction of support vectors; it plays a role similar to the penalty coefficient C in the SVC algorithm and controls the intensity of the penalty. The value range of nu is (0,1], and the default value is 0.5. To compare the classification performance of nu-SVC thoroughly, we computed models with nu values of 0.5, 0.6, 0.7, 0.8, and 0.9 (given that we intend to solve a binary classification problem). Figures S8–S11 show the confusion matrices and ROC curves of nu-SVC on the training dataset and test dataset.

2.6.3. RF Algorithm

RF is an algorithm that integrates multiple DTs through the idea of ensemble learning [55]. Its basic unit is a DT, and each decision tree is a classifier; for an input sample, n trees yield n classification results. RF aggregates the votes of all DTs and designates the category with the most votes as the final output. The number of DTs is a key factor affecting the classification accuracy of the RF model. Therefore, we examined the impact of 50 to 1000 DTs (in steps of 50) on model performance. It should be pointed out that the out-of-bag (OOB) error is a common index for evaluating RF fitting ability, and it tends to stabilize as the model iterates; the larger the stabilized value, the worse the fitting ability of the model, and vice versa. It can be seen from Figure 10a that although all OOB error rates were lower than 0.2, the OOB error rate was not zero for any number of DTs, which indicated that RF’s fit to the training dataset was not perfect. Figure 10b shows the RF classification results on the test dataset for different numbers of DTs. The highest classification accuracy was not 100%, but in most cases it reached 80% or 90%. Overall, RF achieved classification results similar to those of the optimized SVCs shown in Section 2.6.2.
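An illustrative Python sketch of this experiment is shown below (the authors used MATLAB); X_train and y_train are assumed to hold the training spectra and labels.

```python
# Illustrative RF sketch: OOB error versus the number of decision trees
# (50 to 1000 in steps of 50, cf. Figure 10a) and impurity-based importance.
from sklearn.ensemble import RandomForestClassifier

oob_error = {}
for n_trees in range(50, 1001, 50):
    rf = RandomForestClassifier(n_estimators=n_trees, oob_score=True,
                                random_state=0, n_jobs=-1)
    rf.fit(X_train, y_train)
    oob_error[n_trees] = 1.0 - rf.oob_score_   # OOB error rate for this forest

# Gini-based importance per wavelength variable (cf. Figure S12); mapping the
# feature indices back to wavelengths reveals the most informative bands.
importances = rf.feature_importances_
```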
An important function of RF is to calculate the importance of features. Figure S12 shows the mean decrease in accuracy and the Gini index. The larger the mean decrease in accuracy and the Gini index, the higher the importance of the feature. By converting the feature coordinates into wavelength coordinates, it can be found that, after the RF calculation, 1300 nm–1600 nm had the highest importance in the entire wavelength range, confirming the correctness of the theoretical basis for applying aquaphotomics to the analysis of LMWHAs.

2.7. Sample Classification Based on Deep Learning Methods

2.7.1. 1D-CNN

A CNN is a feedforward neural network [56] whose artificial neurons respond to surrounding units within part of the coverage area [57]. The weights and biases of a CNN model are tuned through backpropagation without manual parameter setting. It is currently a very popular deep learning method in the field of computer vision. CNNs are widely used in the processing of 2D images and 3D action signals, but they are still in their infancy in the analysis of 1D signals [58,59], especially NIR spectra. Unlike the classic CNN, the convolution kernel of a 1D-CNN moves in one dimension. Figure S13 shows the architecture of our 1D-CNN (named 1D-CNN-7), which had seven neural layers: the input layer, convolution layer, rectified linear unit (ReLU) layer, max-pooling layer, fully connected (FC) layer, softmax layer, and output layer. Among these layers, the convolution layer extracts different features of the spectral matrix through convolution operations; the ReLU function alleviates the problem of exploding or vanishing gradients and speeds up convergence; the max-pooling layer extracts features again, with each neuron performing a pooling operation on its local receptive field; the FC layer integrates the locally discriminative information from the convolution and max-pooling layers; and the softmax layer normalizes a numerical vector into a probability distribution vector, making the classification result more accurate [57].
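The paper's 1D-CNN-7 was built in MATLAB (Figure S13); as an illustration only, a PyTorch sketch of a seven-layer network of this kind is shown below, with the kernel size, stride, and channel count chosen arbitrarily rather than taken from the paper.

```python
# Illustrative PyTorch sketch of a seven-layer 1D-CNN (input, convolution,
# ReLU, max-pooling, fully connected, softmax, output); hyperparameters here
# are assumptions, not the values used for 1D-CNN-7.
import torch
import torch.nn as nn

class CNN1D7(nn.Module):
    def __init__(self, n_vars=4148, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=2),  # convolution layer
            nn.ReLU(),                                  # ReLU layer
            nn.MaxPool1d(kernel_size=2),                # max-pooling layer
        )
        with torch.no_grad():                           # infer flattened width
            n_flat = self.features(torch.zeros(1, 1, n_vars)).numel()
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(n_flat, n_classes),               # fully connected layer
            nn.Softmax(dim=1),                          # softmax layer
        )

    def forward(self, x):          # x: (batch, 1, n_vars) preprocessed spectra
        return self.classifier(self.features(x))

model = CNN1D7()
probs = model(torch.randn(4, 1, 4148))   # output layer: class probabilities
```

For training, one would typically drop the Softmax layer and apply nn.CrossEntropyLoss to the raw outputs (or pair a LogSoftmax with nn.NLLLoss).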
Figure 11 depicts the loss and accuracy curves of the 1D-CNN-7 training process. It can be seen that as the number of iterations increased, the loss curves of the training set and the cross-validation set converged toward zero, and the accuracy converged toward 100%. When the number of epochs exceeded 80, the classification accuracy of the training set and cross-validation set stabilized at 100%. Figure 12 depicts the classification results for the training dataset and test dataset with 1D-CNN-7. The results showed that the 1D-CNN-7 model had excellent fitting and generalization ability.
Currently, the application of 1D-CNNs to NIR spectroscopy is at an exploratory but rapidly growing stage. 1D-CNNs combined with NIR spectroscopy have achieved satisfactory results in herbal species identification [59,60], tissue cancer detection [61], and fruit trait analysis [62]. To the best of our knowledge, this study is the first application of a 1D-CNN combined with NIR spectroscopy to the classification of polysaccharides.

2.7.2. LSTM

LSTM is a special, widely used type of RNN, mainly designed to overcome vanishing and exploding gradients during training on long sequences [63]. Compared with other neural networks, LSTM is better at processing sequential data, such as speech signals [64]. In our study, the spectral data were treated as sequential data, and the LSTM model shown in Figure S14 was constructed (the basic LSTM unit is shown in Figure S15 [65]). The dropout layer adds a probabilistic process to the neurons of each layer, randomly discarding some neurons to prevent overfitting. To the best of our knowledge, this study is the first to combine LSTM with NIR spectra for the classification of biological macromolecules, especially polysaccharides.
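For illustration, a PyTorch sketch of an LSTM classifier with a dropout layer is given below; the hidden size, dropout rate, and treatment of each spectrum as a length-4148 sequence of scalars are assumptions, not the settings of the MATLAB model in Figure S14.

```python
# Illustrative PyTorch sketch of an LSTM classifier with dropout; assumed
# hyperparameters, not those of the paper's model.
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, hidden_size=64, n_classes=2, p_drop=0.5):
        super().__init__()
        # Each spectrum is treated as a length-4148 sequence of scalars.
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size,
                            batch_first=True)
        self.dropout = nn.Dropout(p_drop)   # randomly discards neurons
        self.fc = nn.Linear(hidden_size, n_classes)

    def forward(self, x):                   # x: (batch, 4148, 1)
        _, (h_n, _) = self.lstm(x)          # h_n: final hidden state
        return self.fc(self.dropout(h_n[-1]))

logits = LSTMClassifier()(torch.randn(4, 4148, 1))
```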
Figure 13 shows the loss and accuracy curves of the LSTM training process. Similar to the behavior in Section 2.7.1, as the number of iterations increased, the loss curves of the training set and the cross-validation set converged toward zero, and the accuracy converged toward 100%. When the number of epochs exceeded 52, the classification accuracy of the training set and cross-validation set stabilized at 100%. As shown in Figure 14, the classification accuracy of both the training dataset and test dataset after processing by the LSTM model was 100%, indicating that the LSTM algorithm performed satisfactorily.
Section 2.7.1 and Section 2.7.2 fully demonstrated the superiority of the deep learning methods in this research. In the past, we may have been troubled by the feature selection problem brought about by the high dimensionality of the NIR spectrum, but deep learning is well suited to handling high-dimensional data and mining more information from it [66]. Traditional machine learning is shallow learning, and model performance depends heavily on effective feature wavelength extraction, which not only increases the complexity of the analysis but also relies heavily on the experience of researchers. In contrast, deep learning has an excellent capacity for feature self-learning. The fact that the two deep learning models achieved better results than traditional machine learning in our research is attributable not only to the general advantages of deep learning but also to the material basis of our research object: LMWHAs have complex intramolecular and intermolecular interactions in aqueous solution. We believe that in the near future, deep learning will make further breakthroughs in spectral analysis methods, represented by NIR spectroscopy, that use chemometric approaches to analyze biological macromolecules, both in the innovation of neural network structures and in solving more practical problems.

3. Materials and Methods

3.1. Samples

The average relative molecular weights of LMWHA–A and LMWHA–E were both 10.0 kDa. LMWHA–A and LMWHA–E were dissolved in deionized water at a concentration of 0.5 mg/mL. There were 9 batches of 90 LMWHA solutions in total; each batch contained 10 LMWHA solutions, of which 5 were LMWHA–A and the other 5 were LMWHA–E. All samples were sterile-packaged and stored in a 4 °C refrigerator for no more than 7 days before spectrum collection. Both LMWHA–A and LMWHA–E were derived from HMWHA with a molecular weight of about 2.0 × 10⁶ Da. The former was reacted for three hours at pH 1.5–2.0 and 80 °C, while the latter was reacted for three hours at pH 5.5–6.0 and 37 °C. All samples were prepared and provided by Bloomage Biotechnology Co., Ltd. (Jinan, China).

3.2. NMR Spectral Data Acquisition and Processing

The NMR spectra were recorded at 25 °C using a Bruker Avance 600 spectrometer (Bruker, Billerica, MA, USA). The 1H-NMR spectrum, the 13C-NMR and DEPT 135° spectra, and the two-dimensional 13C-1H HSQC correlation spectrum were measured. The dried samples were dissolved in deuterium oxide and placed in an NMR tube with an inner diameter of 0.5 mm for testing. The test frequency for 1H-NMR was 600 MHz, and for 13C-NMR and DEPT 135° it was 150 MHz [67]. The number of acquisition points for each type of NMR was 64 K. The number of scans for 1H-NMR was 128, while the number of scans for the other spectra was 16. The recovery delay was 2 s. The free induction decay (FID) signal measured by the NMR instrument was imported into MestReNova 14.0.1 software (Mestrelab Research, Santiago de Compostela, Spain) for Fourier transformation, and the NMR spectra were obtained after phase correction and baseline correction [68]. The spectra were then saved as ASCII files and imported into SpecAlign 2.4.1 software (University of Oxford, Oxford, UK) for peak matching [69].

3.3. FTIR Spectral Data Acquisition

FTIR spectra were collected using an Alpha II FTIR spectrophotometer (Bruker, Billerica, MA, USA) with a liquid cell module. The resolution was set to 2 cm−1, and the sampling temperature was set to 35 °C. The instrument’s default number of scans was used. To ensure the stability of the spectra, the instrument was preheated for more than 30 min.

3.4. NIR Spectral Data Acquisition and Sample Set Division

All NIR spectra were acquired using a MATRIX-F FT-NIR spectrometer (Bruker, Billerica, MA, USA) equipped with a 1 mm cuvette. The spectral range was from 12,800 cm−1 to 4000 cm−1 (780 nm to 2500 nm), and the resolution was 2 cm−1. The spectrometer was used only after it had been switched on for 30 min and had passed the self-test procedure. Air was taken as the reference, and its absorbance was subtracted from each sample spectrum; the sample test temperature was 25 °C. The number of scans was set to 64. Each sample to be tested was divided into three equal volumes, and their spectra were collected and averaged to serve as the final spectrum of the sample. Before building the machine learning models, the 80 samples in the first 8 batches were assigned to the training set, and the 10 samples in the last batch were assigned to the test set. The test set samples did not participate in any model cross-validation or parameter-tuning process. The size of the original spectral matrix was 90 (number of samples) × 4148 (number of variables).

3.5. NIR Spectral Preprocessing

In addition to the desired sample characteristics, the information collected by NIR spectroscopy is often mixed with irrelevant information and noise, such as stray light, strong electrical noise, and man-made noise introduced during transmission [70]. Preprocessing the spectral data can reduce system noise and enhance spectral features. The SG smoothing filter is a polynomial smoothing algorithm based on the principle of least squares, which retains useful information in the analyzed signal while eliminating random noise [71]. MSC can effectively eliminate the spectral differences caused by different scattering levels of samples, thereby enhancing the correlation between spectra and data [72]. These two algorithms were used to preprocess the NIR spectra, as sketched below.
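The following Python sketch illustrates the two preprocessing steps (the authors used MATLAB); the Savitzky–Golay window length and polynomial order are assumptions, and the mean spectrum is used as the MSC reference.

```python
# Illustrative preprocessing sketch: Savitzky-Golay smoothing followed by
# multiplicative scatter correction (MSC) of an (n_samples, n_vars) matrix.
import numpy as np
from scipy.signal import savgol_filter

def preprocess(X, window=15, polyorder=3):      # window/polyorder are assumed
    # SG smoothing along the wavelength axis of each spectrum.
    X_sg = savgol_filter(X, window_length=window, polyorder=polyorder, axis=1)
    ref = X_sg.mean(axis=0)                     # mean spectrum as MSC reference
    X_msc = np.empty_like(X_sg)
    for i, spec in enumerate(X_sg):
        # Fit each spectrum against the reference: spec ~ intercept + slope*ref.
        slope, intercept = np.polyfit(ref, spec, deg=1)
        X_msc[i] = (spec - intercept) / slope   # remove additive/multiplicative scatter
    return X_msc
```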

3.6. 2DCOS Analysis

2DCOS is one of the tools widely used for in-depth analysis of vibrational spectral data, including NIR spectra [73]. In our study, the perturbing factor of the 2DCOS was the degradation mode [74]. A 2DCOS analysis comprises synchronous and asynchronous spectra [43]. The synchronous correlation spectrum is obtained as the covariance matrix of the measured spectra, and the asynchronous correlation spectrum as the product of the measured spectral matrix and its Hilbert–Noda transform [43]. The synchronous correlation is symmetric about the main diagonal. The peaks located on the main diagonal are called auto-peaks; an auto-peak is always positive, and its intensity represents the sensitivity of the corresponding absorption band to the external perturbation. The peaks outside the main diagonal are called cross-peaks, which can be positive or negative; their appearance indicates a synergistic response of functional groups to the external perturbing factor. A positive cross-peak indicates that the intensities of the two bands increase or decrease in the same direction with the external perturbation, and a negative cross-peak indicates opposite changes [75]. The asynchronous correlation is antisymmetric about the main diagonal. It has no auto-peaks, only cross-peaks off the diagonal, which indicate whether there is strong chemical interaction, direct connection, or pairing between functional groups [76]. The asynchronous correlation can greatly improve the resolution of the spectrum.
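A minimal NumPy sketch of Noda's construction, assuming the textbook form of the Hilbert–Noda transformation matrix, is shown below for illustration (the actual computation was done in MATLAB).

```python
# Illustrative 2DCOS sketch: synchronous spectrum as the covariance of the
# mean-centered dynamic spectra; asynchronous spectrum via the Hilbert-Noda
# transformation matrix N, with N[j, k] = 0 if j == k else 1 / (pi * (k - j)).
import numpy as np

def two_d_cos(X):
    """X: (m perturbations, n wavelengths) spectra measured under perturbation."""
    m = X.shape[0]
    Y = X - X.mean(axis=0)                    # dynamic (mean-centered) spectra
    sync = Y.T @ Y / (m - 1)                  # synchronous correlation spectrum
    j, k = np.ogrid[:m, :m]
    with np.errstate(divide="ignore"):        # diagonal handled by np.where
        N = np.where(j == k, 0.0, 1.0 / (np.pi * (k - j)))
    async_ = Y.T @ (N @ Y) / (m - 1)          # asynchronous correlation spectrum
    return sync, async_
```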

3.7. Aquaphotomics Analysis

Aquaphotomics is a novel and efficient theory for the analysis of water systems, which uses the absorption spectral features of water to characterize samples, gathering information about chemical composition and environmental conditions in an indirect manner [45]. Hydrogen bonding is the main factor affecting the conformation of HA in aqueous solution [77]. Aquaphotomics can analyze hydrogen bond information in water systems, so this method is particularly suitable for our study. As in most aquaphotomics research, the near-infrared region of the first overtone of water at 1300 nm–1600 nm was selected for analysis. The WAMACs follow a protocol of aquaphotomics analysis proposed by Prof. Tsenkova and are determined by an array of analyses [78]. Normalization was then carried out at the selected absorbance bands, and the results were finally presented in the form of radar maps.
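As an illustrative sketch only (the radar maps were produced in MATLAB), the band-wise normalization might look as follows in Python; the centering-and-scaling formula is an assumption based on common aquagram practice, and the WAMAC positions would come from Table S1.

```python
# Illustrative sketch: normalize absorbance at selected WAMAC wavelengths
# across samples before plotting radar maps (aquagrams).
import numpy as np

def aquagram_values(X, wavelengths, wamacs):
    """X: (n_samples, n_vars) preprocessed spectra; wavelengths: (n_vars,) in nm;
    wamacs: WAMAC band centers in nm (taken from Table S1 in practice)."""
    cols = [int(np.argmin(np.abs(wavelengths - w))) for w in wamacs]
    A = X[:, cols]
    # Center and scale each band across samples (assumed normalization).
    return (A - A.mean(axis=0)) / A.std(axis=0)
```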

3.8. Data Dimensionality Reduction

Data dimensionality reduction is beneficial to eliminating a large number of redundant or irrelevant variables contained in spectral data so as to realize the description of data with less feature dimensionality, which is usually used as a preprocessing step of traditional classification algorithms [79,80]. The dimensionality reduction effects of PCA, KPCA, and t-SNE were compared.

3.9. Sample Classification Based on Machine Learning Methods

Compared with traditional machine learning methods, deep learning generally does not require human intervention in feature selection or dimensionality reduction and has advantages in processing high-dimensional, large-sample data [81,82,83]. On this basis, we compared the classification effects of traditional machine learning methods (PLS-DA, SVC, and RF) and deep learning methods (CNN and LSTM). Since the results of the calculations in Section 3.8 were ultimately unsatisfactory, the input matrix of the classification models did not come from dimensionality reduction; that is, the size of the input matrix of all classification models was 90 × 4148. In this part of the work, we also nested the intelligent optimization algorithms GS, GA, and PSO around SVC. After preprocessing the spectral data and before the machine learning step, normalization was conducted according to formula (1) (where X represents the spectral matrix before normalization, Xmax the maximum value in the matrix, Xmin the minimum value in the matrix, and X′ the matrix after normalization), aiming to reduce the influence of outliers and extreme values. All classification models were run more than 10 times to avoid accidental errors.
Confusion matrices were used to characterize the accuracy, precision, specificity, sensitivity (recall), and F1 score of the classification results; these five indicators were calculated according to formulas (2)–(6) [84,85]. For each confusion matrix, the rows correspond to the predicted class and the columns to the true class. The diagonal cells correspond to correctly classified observations, called true positives (TP) and true negatives (TN); the off-diagonal cells correspond to incorrectly classified observations, called false positives (FP) and false negatives (FN). Each cell shows both the number of observations and the percentage of the total number of observations. The far-right column of each matrix plot shows the percentages of all examples predicted to belong to each class that were correctly and incorrectly classified; the bottom row shows the percentages of all examples belonging to each class that were correctly and incorrectly classified; and the bottom-right cell shows the overall accuracy. Meanwhile, the ROC curve and the AUC were used as additional evaluation indexes of the classification results [86].
X′ = (X − Xmin) / (Xmax − Xmin)  (1)
Accuracy (%) = (TP + TN) / (TP + TN + FP + FN) × 100  (2)
Precision (%) = TP / (TP + FP) × 100  (3)
Specificity (%) = TN / (TN + FP) × 100  (4)
Sensitivity (%) = Recall (%) = TP / (TP + FN) × 100  (5)
F1 Score = 2 × Precision × Recall / (Precision + Recall)  (6)
where TP = true positive, TN = true negative, FP = false positive, and FN = false negative.
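For reference, formulas (1)–(6) can be computed directly from the labels; the sketch below uses scikit-learn only for the confusion matrix and AUC and is illustrative rather than the MATLAB code actually used.

```python
# Illustrative computation of formula (1) and the indicators in formulas
# (2)-(6), plus AUC, for a binary problem with labels 0 and 1.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def min_max_normalize(X):
    return (X - X.min()) / (X.max() - X.min())        # formula (1), matrix-wide

def classification_metrics(y_true, y_pred, y_score=None):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    accuracy    = (tp + tn) / (tp + tn + fp + fn) * 100   # formula (2)
    precision   = tp / (tp + fp) * 100                    # formula (3)
    specificity = tn / (tn + fp) * 100                    # formula (4)
    sensitivity = tp / (tp + fn) * 100                    # formula (5), recall
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # formula (6)
    auc = roc_auc_score(y_true, y_score) if y_score is not None else None
    return accuracy, precision, specificity, sensitivity, f1, auc
```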

3.10. Programming Language

MATLAB R2022a (MathWorks Inc., Natick, MA, USA) was used for calculation and visualization.

4. Conclusions

The penetration ability of LMWHAs is much improved compared with the HA from which they are derived, giving them broad application prospects in the macromolecule field worldwide. Although both acid degradation and enzymatic hydrolysis can produce LMWHAs, the former is harmful to human health and the environment. Accurate classification of LMWHA–A and LMWHA–E helps avoid the health risks caused by the accumulation of chemical reagents and free residues. In this study, NIR spectroscopy combined with machine learning proved to be a fast, accurate, environmentally friendly, and low-cost solution.
NMR, FTIR, 2DCOS, and aquaphotomics were used to analyze the differences in chemical structure between LMWHA–A and LMWHA–E, which is a prerequisite for accurate classification. To intuitively understand the spatial distribution of the two types of samples and eliminate the multicollinearity of the data, the applicability of linear (PCA) and nonlinear (KPCA and t-SNE) methods to the NIR spectra was compared. Then, based on the NIR spectra of the samples, several representative machine learning methods were used to classify and identify LMWHA–A and LMWHA–E solutions. However, the traditional machine learning methods (PLS-DA, SVC, and RF) did not perform adequately. Finally, we tested the 1D-CNN-7 and LSTM deep learning models and found that both achieved excellent classification results.
It is worth mentioning that, to improve model performance, traditional NIR analysis requires manual feature selection, while deep learning enables the computer to learn pattern features automatically, which reduces the workload and, owing to its strong function approximation ability, offers advantages in the study of complex systems. In summary, we successfully classified two LMWHA solutions quickly and accurately based on NIR spectroscopy and deep learning. At the same time, our research is the first to compare traditional machine learning and deep learning for LMWHA classification, providing a methodological reference for the classification of biological macromolecules, especially polysaccharides.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/molecules28020809/s1, Figure S1: NMR spectra of LMWHA–A and LMWHA–E; Figure S2: FTIR spectra of LMWHA–A and LMWHA–E solutions; Figure S3: Deduced chemical structures of LMWHA–A and LMWHA–E; Figure S4: Confusion matrices of the training dataset for SVC, GS−SVC, GA−SVC, and PSO−SVC; Figure S5: ROC curves of the training dataset; Figure S6: Confusion matrices of the test dataset; Figure S7: ROC curves of the test dataset; Figures S8–S11: Confusion matrices and ROC curves of nu-SVC on the training and test datasets; Figure S12: Mean decrease in accuracy and Gini index by RF; Figure S13: Model architecture of 1D-CNN-7; Figure S14: Model architecture of LSTM; Figure S15: Basic unit of LSTM; Table S1: WAMACs.

Author Contributions

Conceptualization, H.Z.; investigation, L.Z. (Lixuan Zang); methodology, L.N. and L.L.; validation, L.Z. (Liang Zhong); samples and financial support, X.G. and S.H.; software and visualization, W.T.; writing—original draft preparation, W.T.; writing—review and editing, H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Shandong Province (ZR2022MB026), the Construction of Active Ingredient Sample Bank and Research on Innovative Drugs of Shandong Native Medicinal Herbs (2021CXGC010507), and the Qinghai Special Project of Innovation Platform for Basic Conditions of Scientific Research of China (2020-ZJ-T05).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this work are available in the article and supplementary materials.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Burdick, J.A.; Prestwich, G.D. Hyaluronic acid hydrogels for biomedical applications. Adv. Mater. 2011, 23, 41–56. [Google Scholar] [CrossRef] [PubMed]
  2. Gupta, R.C.; Lall, R.; Srivastava, A.; Sinha, A. Hyaluronic acid: Molecular mechanisms and therapeutic trajectory. Front. Vet. Sci. 2019, 6, 192–215. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  3. Bayer, I.S. Hyaluronic acid and controlled release: A review. Molecules 2020, 25, 2649. [Google Scholar] [CrossRef]
  4. Chang, W.H.; Liu, P.Y.; Lin, M.H.; Lu, C.J.; Chou, H.Y.; Nian, C.Y.; Jiang, Y.T.; Hsu, Y.H. Applications of hyaluronic acid in ophthalmology and contact Lenses. Molecules 2021, 26, 2485. [Google Scholar] [CrossRef]
  5. Juncan, A.M.; Moisă, D.G.; Santini, A.; Morgovan, C.; Rus, L.L.; Vonica-Țincu, A.L.; Loghin, F. Advantages of hyaluronic acid and its combination with other bioactive ingredients in cosmeceuticals. Molecules 2021, 26, 4429. [Google Scholar] [CrossRef]
  6. Radrezza, S.; Aiello, G.; Baron, G.; Aldini, G.; Carini, M.; D’Amato, A. Integratomics of human dermal fibroblasts treated with low molecular weight hyaluronic acid. Molecules 2021, 26, 5096. [Google Scholar] [CrossRef]
  7. Gao, Y.; Sun, Y.; Yang, H.; Qiu, P.; Cong, Z.; Zou, Y.; Song, L.; Guo, J.; Anastassiades, T.P. A low molecular weight hyaluronic acid derivative accelerates excisional wound healing by modulating pro-inflammation, promoting epithelialization and neovascularization, and remodeling collagen. Int. J. Mol. Sci. 2019, 20, 3722. [Google Scholar] [CrossRef] [Green Version]
  8. Lokeshwar, V.B.; Mirza, S.; Jordan, A. Targeting hyaluronic acid family for cancer chemoprevention and therapy. Adv. Cancer. Res. 2014, 123, 35–65. [Google Scholar] [CrossRef] [Green Version]
  9. Zamboni, F.; Vieira, S.; Reis, R.L.; Oliveira, J.M.; Collins, M.N. The potential of hyaluronic acid in immunoprotection and immunomodulation: Chemistry, processing and function. Prog. Nat. Sci. 2018, 97, 97–122. [Google Scholar] [CrossRef]
  10. Rayahin, J.E.; Buhrman, J.S.; Zhang, Y.; Koh, T.J.; Gemeinhart, R.A. High and low molecular weight hyaluronic acid differentially influence macrophage activation. ACS Biomater. Sci. Eng. 2015, 1, 481–493. [Google Scholar] [CrossRef]
  11. Schmidt, J.; Pilbauerova, N.; Soukup, T.; Suchankova-Kleplova, T.; Suchanek, J. Low molecular weight hyaluronic acid effect on dental pulp stem cells in vitro. Biomolecules 2020, 11, 22. [Google Scholar] [CrossRef]
  12. Valcarcel, J.; García, M.R.; Varela, U.R.; Vázquez, J.A. Hyaluronic acid of tailored molecular weight by enzymatic and acid depolymerization. Int. J. Biol. Macromol. 2020, 145, 788–794. [Google Scholar] [CrossRef]
  13. Pang, B.; Wang, H.; Huang, H.; Liao, L.; Wang, Y.; Wang, M.; Du, G.; Kang, Z. Enzymatic production of low-molecular-weight hyaluronan and its oligosaccharides: A review and prospects. J. Agric. Food Chem. 2022, 70, 14129–14139. [Google Scholar] [CrossRef] [PubMed]
  14. DeLorenzi, C. New high dose pulsed hyaluronidase protocol for hyaluronic acid filler vascular adverse events. Aesthet. Surg. J. 2017, 37, 814–825. [Google Scholar] [CrossRef] [PubMed]
  15. Beć, K.B.; Grabska, J.; Huck, C.W. Near-infrared spectroscopy in bio-applications. Molecules 2020, 25, 2948. [Google Scholar] [CrossRef] [PubMed]
  16. Beć, K.B.; Huck, C.W. Advances in near-infrared spectroscopy and related computational methods. Molecules 2019, 24, 4370. [Google Scholar] [CrossRef] [Green Version]
  17. Tian, W.; Chen, G.; Zhang, G.; Wang, D.; Tilley, M.; Li, Y. Rapid determination of total phenolic content of whole wheat flour using near-infrared spectroscopy and chemometrics. Food Chem. 2021, 344, 128633. [Google Scholar] [CrossRef] [PubMed]
  18. Amirvaresi, A.; Nikounezhad, N.; Amirahmadi, M.; Daraei, B.; Parastar, H. Comparison of near-infrared (NIR) and mid-infrared (MIR) spectroscopy based on chemometrics for saffron authentication and adulteration detection. Food Chem. 2021, 344, 128647. [Google Scholar] [CrossRef]
  19. Yang, J.; Xu, J.; Zhang, X.; Wu, C.; Lin, T.; Ying, Y. Deep learning for vibrational spectral analysis: Recent progress and a practical guide. Anal. Chim. Acta 2019, 1081, 6–17. [Google Scholar] [CrossRef]
  20. Lee, L.C.; Liong, C.Y.; Jemain, A.A. Partial least squares-discriminant analysis (PLS-DA) for classification of high-dimensional (HD) data: A review of contemporary practice strategies and knowledge gaps. Analyst 2018, 143, 3526–3539. [Google Scholar] [CrossRef]
  21. Luo, X.; Wen, X.; Zhou, M.; Abusorrah, A.; Huang, L. Decision-tree-initialized dendritic neuron model for fast and accurate data classification. IEEE Trans. Neural Netw. Learn. 2022, 33, 4173–4183. [Google Scholar] [CrossRef] [PubMed]
22. Speiser, J.L.; Miller, M.E.; Tooze, J.; Ip, E. A comparison of random forest variable selection methods for classification prediction modeling. Expert Syst. Appl. 2019, 134, 93–101.
23. Saritas, M.M.; Yasar, A. Performance analysis of ANN and naive bayes classification algorithm for data classification. Int. J. Intell. Syst. 2019, 7, 88–91.
24. Pan, Z.; Wang, Y.; Pan, Y. A new locally adaptive k-nearest neighbor algorithm based on discrimination class. Knowl. Based Syst. 2020, 204, 106185.
25. Cervantes, J.; Garcia-Lamont, F.; Rodríguez-Mazahua, L.; Lopez, A. A comprehensive survey on support vector machine classification: Applications, challenges and trends. Neurocomputing 2020, 408, 189–215.
26. Greener, J.G.; Kandathil, S.M.; Moffat, L.; Jones, D.T. A guide to machine learning for biologists. Nat. Rev. Mol. Cell Biol. 2022, 23, 40–55.
27. Du, M.; Liu, N.; Hu, X. Techniques for interpretable machine learning. Commun. ACM 2019, 63, 68–77.
28. Huang, W.; Liu, H.; Zhang, Y.; Mi, R.; Tong, C.; Xiao, W.; Shuai, B. Railway dangerous goods transportation system risk identification: Comparisons among SVM, PSO-SVM, GA-SVM and GS-SVM. Appl. Soft Comput. 2021, 109, 107541.
29. Kaseb, Z.; Rahbar, M. Towards CFD-based optimization of urban wind conditions: Comparison of genetic algorithm, particle swarm optimization, and a hybrid algorithm. Sustain. Cities Soc. 2022, 77, 103565.
30. Esteva, A.; Robicquet, A.; Ramsundar, B.; Kuleshov, V.; DePristo, M.; Chou, K.; Cui, C.; Corrado, G.; Thrun, S.; Dean, J. A guide to deep learning in healthcare. Nat. Med. 2019, 25, 24–29.
31. Wang, J.; Ma, Y.; Zhang, L.; Gao, R.X.; Wu, D. Deep learning for smart manufacturing: Methods and applications. J. Manuf. Syst. 2018, 48, 144–156.
32. Sun, Y.; Xue, B.; Zhang, M.; Yen, G.G.; Lv, J. Automatically designing CNN architectures using the genetic algorithm for image classification. IEEE Trans. Cybern. 2020, 50, 3840–3854.
33. Banerjee, I.; Ling, Y.; Chen, M.C.; Hasan, S.A.; Langlotz, C.P.; Moradzadeh, N.; Chapman, B.; Amrhein, T.; Mong, D.; Rubin, D.L.; et al. Comparative effectiveness of convolutional neural network (CNN) and recurrent neural network (RNN) architectures for radiology text report classification. Artif. Intell. Med. 2019, 97, 79–88.
34. Hannun, A.Y.; Rajpurkar, P.; Haghpanahi, M.; Tison, G.H.; Bourn, C.; Turakhia, M.P.; Ng, A.Y. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat. Med. 2019, 25, 65–69.
35. Chen, H.; Qin, J.; Hu, Y. Efficient degradation of high-molecular-weight hyaluronic acid by a combination of ultrasound, hydrogen peroxide, and copper ion. Molecules 2019, 24, 617.
36. Alkrad, J.A.; Mrestani, Y.; Stroehl, D.; Wartewig, S.; Neubert, R. Characterization of enzymatically digested hyaluronic acid using NMR, Raman, IR, and UV-Vis spectroscopies. J. Pharm. Biomed. Anal. 2003, 31, 545–550.
37. Mendes, M.; Cova, T.; Basso, J.; Ramos, M.L.; Vitorino, R.; Sousa, J.; Pais, A.; Vitorino, C. Hierarchical design of hyaluronic acid-peptide constructs for glioblastoma targeting: Combining insights from NMR and molecular dynamics simulations. J. Mol. Liq. 2020, 315, 113774.
38. Tokita, Y.; Okamoto, A. Hydrolytic degradation of hyaluronic acid. Polym. Degrad. Stab. 1995, 48, 269–273.
39. Buhren, B.A.; Schrumpf, H.; Hoff, N.-P.; Bölke, E.; Hilton, S.; Gerber, P.A. Hyaluronidase: From clinical applications to molecular and cellular mechanisms. Eur. J. Med. Res. 2016, 21, 5.
40. Bázár, G.; Romvári, R.; Szabó, A.; Somogyi, T.; Éles, V.; Tsenkova, R. NIR detection of honey adulteration reveals differences in water spectral pattern. Food Chem. 2016, 194, 873–880.
41. Kovacs, Z.; Bázár, G.; Oshima, M.; Shigeoka, S.; Tanaka, M.; Furukawa, A.; Nagai, A.; Osawa, M.; Itakura, Y.; Tsenkova, R. Water spectral pattern as holistic marker for water quality monitoring. Talanta 2016, 147, 598–608.
42. Dong, Q.; Guo, X.; Li, L.; Yu, C.; Nie, L.; Tian, W.; Zhang, H.; Huang, S.; Zang, H. Understanding hyaluronic acid induced variation of water structure by near-infrared spectroscopy. Sci. Rep. 2020, 10, 1387.
43. Noda, I. Two-dimensional infrared spectroscopy. J. Am. Chem. Soc. 1989, 111, 8116–8118.
44. Noda, I. Determination of two-dimensional correlation spectra using the Hilbert transform. Appl. Spectrosc. 2000, 54, 994–999.
45. Muncan, J.; Tsenkova, R. Aquaphotomics—From innovative knowledge to integrative platform in science and technology. Molecules 2019, 24, 2742.
46. Johnstone, I.M.; Paul, D. PCA in high dimensions: An orientation. Proc. IEEE Inst. Electr. Electron. Eng. 2018, 106, 1277–1292.
47. Lee, J.-M.; Yoo, C.; Choi, S.W.; Vanrolleghem, P.A.; Lee, I.-B. Nonlinear process monitoring using kernel principal component analysis. Chem. Eng. Sci. 2004, 59, 223–234.
48. Schölkopf, B.; Smola, A.; Müller, K.-R. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 1998, 10, 1299–1319.
49. Kobak, D.; Berens, P. The art of using t-SNE for single-cell transcriptomics. Nat. Commun. 2019, 10, 5416.
50. Hekmatmanesh, A.; Wu, H.; Jamaloo, F.; Li, M.; Handroos, H. A combination of CSP-based method with soft margin SVM classifier and generalized RBF kernel for imagery-based brain computer interface applications. Multimed. Tools Appl. 2020, 79, 17521–17549.
51. Yan, X.; Jia, M. A novel optimized SVM classification algorithm with multi-domain feature and its application to fault diagnosis of rolling bearing. Neurocomputing 2018, 313, 47–64.
52. Sun, Y.; Ding, S.; Zhang, Z.; Jia, W. An improved grid search algorithm to optimize SVR for prediction. Soft Comput. 2021, 25, 5633–5644.
53. Soumaya, Z.; Taoufiq, B.D.; Benayad, N.; Yunus, K.; Abdelkrim, A. The detection of Parkinson disease using the genetic algorithm and SVM classifier. Appl. Acoust. 2021, 171, 107528.
54. Kour, V.P.; Arora, S. Particle swarm optimization based support vector machine (P-SVM) for the segmentation and classification of plants. IEEE Access 2019, 7, 29374–29385.
55. Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
56. Yao, P.; Wu, H.; Gao, B.; Tang, J.; Zhang, Q.; Zhang, W.; Yang, J.J.; Qian, H. Fully hardware-implemented memristor convolutional neural network. Nature 2020, 577, 641–646.
57. Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A survey of convolutional neural networks: Analysis, applications, and prospects. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 6999–7019.
58. Kiranyaz, S.; Avci, O.; Abdeljaber, O.; Ince, T.; Gabbouj, M.; Inman, D.J. 1D convolutional neural networks and applications: A survey. Mech. Syst. Signal Process. 2021, 151, 107398.
59. Chen, X.; Chai, Q.; Lin, N.; Li, X.; Wang, W. 1D convolutional neural network for the discrimination of aristolochic acids and their analogues based on near-infrared spectroscopy. Anal. Methods 2019, 11, 5118–5125.
60. Chai, Q.; Zeng, J.; Lin, D.; Li, X.; Huang, J.; Wang, W. Improved 1D convolutional neural network adapted to near-infrared spectroscopy for rapid discrimination of Anoectochilus roxburghii and its counterfeits. J. Pharm. Biomed. Anal. 2021, 199, 114035.
61. Shang, H.; Shang, L.; Wu, J.; Xu, Z.; Zhou, S.; Wang, Z.; Wang, H.; Yin, J. NIR spectroscopy combined with 1D-convolutional neural network for breast cancerization analysis and diagnosis. Spectrochim. Acta A Mol. Biomol. Spectrosc. 2023, 287, 121990.
62. Mishra, P.; Passos, D. Multi-output 1-dimensional convolutional neural networks for simultaneous prediction of different traits of fruit based on near-infrared spectroscopy. Postharvest Biol. Technol. 2022, 183, 111741.
63. Van Houdt, G.; Mosquera, C.; Nápoles, G. A review on the long short-term memory model. Artif. Intell. Rev. 2020, 53, 5929–5955.
64. Zheng, J.; Ma, L.; Wu, Y.; Ye, L.; Shen, F. Nonlinear dynamic soft sensor development with a supervised hybrid CNN-LSTM network for industrial processes. ACS Omega 2022, 7, 16653–16664.
65. Birim, S.; Kazancoglu, I.; Mangla, S.K.; Kahraman, A.; Kazancoglu, Y. The derived demand for advertising expenses and implications on sustainability: A comparative study using deep learning and traditional machine learning methods. Ann. Oper. Res. 2022, 1, 1–31.
66. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
67. Keun, H.C.; Beckonert, O.; Griffin, J.L.; Richter, C.; Moskau, D.; Lindon, J.C.; Nicholson, J.K. Cryogenic probe 13C NMR spectroscopy of urine for metabonomic studies. Anal. Chem. 2002, 74, 4588–4593.
68. Palmioli, A.; Alberici, D.; Ciaramelli, C.; Airoldi, C. Metabolomic profiling of beers: Combining 1H NMR spectroscopy and chemometric approaches to discriminate craft and industrial products. Food Chem. 2020, 327, 127025.
69. Waidyanatha, S.; Pierfelice, J.; Cristy, T.; Mutlu, E.; Burback, B.; Rider, C.V.; Ryan, K. A strategy for test article selection and phytochemical characterization of Echinacea purpurea extract for safety testing. Food Chem. Toxicol. 2020, 137, 111125.
70. Torniainen, J.; Afara, I.O.; Prakash, M.; Sarin, J.K.; Stenroth, L.; Töyräs, J. Open-source python module for automated preprocessing of near infrared spectroscopic data. Anal. Chim. Acta 2020, 1108, 1–9.
71. Sohn, S.I.; Oh, Y.J.; Pandian, S.; Lee, Y.H.; Zaukuu, J.L.Z.; Kang, H.J.; Ryu, T.H.; Cho, W.S.; Cho, Y.S.; Shin, E.K. Identification of Amaranthus species using visible-near-infrared (vis-NIR) spectroscopy and machine learning methods. Remote Sens. 2021, 13, 4149.
72. Mishra, P.; Lohumi, S. Improved prediction of protein content in wheat kernels with a fusion of scatter correction methods in NIR data modelling. Biosyst. Eng. 2021, 203, 93–97.
73. Hong, Y.; Chen, S.; Zhang, Y.; Chen, Y.; Yu, L.; Liu, Y.; Liu, Y.; Cheng, H.; Liu, Y. Rapid identification of soil organic matter level via visible and near-infrared spectroscopy: Effects of two-dimensional correlation coefficient and extreme learning machine. Sci. Total Environ. 2018, 644, 1232–1243.
74. Noda, I. Recent advancement in the field of two-dimensional correlation spectroscopy. J. Mol. Struct. 2008, 883, 2–26.
75. Zanni, M.T.; Hochstrasser, R.M. Two-dimensional infrared spectroscopy: A promising new method for the time resolution of structures. Curr. Opin. Struct. Biol. 2001, 11, 516–522.
76. Le Sueur, A.L.; Horness, R.E.; Thielges, M.C. Applications of two-dimensional infrared spectroscopy. Analyst 2015, 140, 4336–4349.
77. Knopf-Marques, H.; Pravda, M.; Wolfova, L.; Velebny, V.; Schaaf, P.; Vrana, N.E.; Lavalle, P. Hyaluronic acid and its derivatives in coating and delivery systems: Applications in tissue engineering, regenerative medicine and immunomodulation. Adv. Healthc. Mater. 2016, 5, 2841–2855.
78. Tsenkova, R.; Munćan, J.; Pollner, B.; Kovacs, Z. Essentials of aquaphotomics and its chemometrics approaches. Front. Chem. 2018, 6, 363.
79. Wei, H.L.; Billings, S.A. Feature subset selection and ranking for data dimensionality reduction. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 162–166.
80. Ayesha, S.; Hanif, M.K.; Talib, R. Overview and comparative study of dimensionality reduction techniques for high dimensional data. Inf. Fusion 2020, 59, 44–58.
81. Al Rahhal, M.M.; Bazi, Y.; AlHichri, H.; Alajlan, N.; Melgani, F.; Yager, R.R. Deep learning approach for active classification of electrocardiogram signals. Inf. Sci. 2016, 345, 340–354.
82. Azimi, S.M.; Britz, D.; Engstler, M.; Fritz, M.; Mücklich, F. Advanced steel microstructural classification by deep learning methods. Sci. Rep. 2018, 8, 2128.
83. Wang, P.; Fan, E.; Wang, P. Comparative analysis of image classification algorithms based on traditional machine learning and deep learning. Pattern Recognit. Lett. 2021, 141, 61–67.
84. Luque, A.; Carrasco, A.; Martín, A.; de Las Heras, A. The impact of class imbalance in classification performance metrics based on the binary confusion matrix. Pattern Recognit. 2019, 91, 216–231.
85. Deng, X.; Liu, Q.; Deng, Y.; Mahadevan, S. An improved method to construct basic probability assignment based on the confusion matrix for classification problem. Inf. Sci. 2016, 340, 250–261.
86. Pepe, M.S.; Cai, T.; Longton, G. Combining predictors for classification using the area under the receiver operating characteristic curve. Biometrics 2006, 62, 221–229.
Figure 1. A flow diagram of this study.
Figure 2. The NIR spectra of all LMWHA solution samples in the wavelength range of 780–2500 nm. (a) Raw spectra; (b) spectra after preprocessing with the SG smoothing filter and MSC.
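As a minimal sketch of this preprocessing step (the Savitzky–Golay window length, polynomial order, and the placeholder data below are illustrative assumptions, not the settings used in this study):

```python
import numpy as np
from scipy.signal import savgol_filter

def msc(spectra):
    """Multiplicative scatter correction: regress each spectrum against
    the mean spectrum and remove the fitted slope and offset."""
    ref = spectra.mean(axis=0)
    out = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)
        out[i] = (s - intercept) / slope
    return out

# Placeholder absorbance matrix: 20 samples x 700 wavelength points
rng = np.random.default_rng(0)
X = rng.random((20, 700))
X_sg = savgol_filter(X, window_length=11, polyorder=2, axis=1)  # SG smoothing
X_pre = msc(X_sg)                                               # scatter correction
```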
Figure 3. Synchronous spectra of (a1) LMWHA–A and (a2) LMWHA–E; Auto-peak spectra of (b1) LMWHA–A and (b2) LMWHA–E; Asynchronous spectra of (c1) LMWHA–A and (c2) LMWHA–E.
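The synchronous and asynchronous spectra in Figure 3 follow Noda's generalized 2D correlation formalism [43,44]; a compact sketch is given below, with the auto-peak spectrum taken as the diagonal of the synchronous map.

```python
import numpy as np

def two_d_cos(spectra):
    """Generalized 2D correlation spectra (Noda). `spectra` is an (m, n)
    matrix: m perturbation steps (e.g., concentrations) x n wavelengths."""
    Y = spectra - spectra.mean(axis=0)          # dynamic spectra
    m = Y.shape[0]
    sync = Y.T @ Y / (m - 1)                    # synchronous spectrum
    j, k = np.indices((m, m))
    with np.errstate(divide="ignore"):
        N = np.where(j == k, 0.0, 1.0 / (np.pi * (k - j)))  # Hilbert-Noda matrix
    asyn = Y.T @ N @ Y / (m - 1)                # asynchronous spectrum
    return sync, asyn

sync, asyn = two_d_cos(X_pre)                   # X_pre from the preprocessing sketch
auto_peaks = np.diag(sync)                      # auto-peak spectrum
```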
Figure 4. Aquagrams of LMWHA–A and LMWHA–E solutions tested at 25 °C.
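An aquagram displays normalized absorbance at characteristic water absorbance bands on radar axes [45]. The sketch below is illustrative only: the band positions and sample grouping are assumptions, and the actual water matrix coordinates (WAMACS) behind Figure 4 may differ.

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative water-band positions (nm); assumptions, not the study's WAMACS
bands = [1342, 1364, 1374, 1384, 1412, 1426, 1440, 1452, 1462, 1476, 1488, 1512]

def aquagram(X, wavelengths, bands):
    """Autoscale the absorbance at each selected band across all samples."""
    idx = [int(np.argmin(np.abs(wavelengths - b))) for b in bands]
    sel = X[:, idx]
    return (sel - sel.mean(axis=0)) / sel.std(axis=0)

wl = np.linspace(780, 2500, X_pre.shape[1])     # wavelength axis (nm)
vals = aquagram(X_pre, wl, bands)

angles = np.linspace(0, 2 * np.pi, len(bands), endpoint=False)
ax = plt.subplot(polar=True)
# Placeholder grouping: first 10 rows as LMWHA-A, last 10 as LMWHA-E
for group, label in [(vals[:10], "LMWHA-A"), (vals[10:], "LMWHA-E")]:
    mean = group.mean(axis=0)
    ax.plot(np.r_[angles, angles[0]], np.r_[mean, mean[0]], label=label)
ax.set_xticks(angles); ax.set_xticklabels(bands); ax.legend()
plt.show()
```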
Figure 5. (a) 2D score plot and (b) correlation loadings plot of PC1 and PC2 of LMWHAs.
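Scores of the kind shown in Figure 5a can be obtained with a standard PCA; a minimal sketch using the hypothetical X_pre matrix from the preprocessing sketch:

```python
from sklearn.decomposition import PCA

pca = PCA(n_components=2)
scores = pca.fit_transform(X_pre)          # PC1/PC2 scores for each sample
print(pca.explained_variance_ratio_)       # variance explained by each PC
```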
Figure 6. 2D score plots of KPCA with (a) Gaussian kernel, (b) polynomial kernel, (c) sigmoid kernel, and (d) Laplacian kernel.
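A sketch of the four kernel variants behind Figure 6, assuming scikit-learn: the Gaussian (RBF), polynomial, and sigmoid kernels are built in, while the Laplacian kernel is supplied as a precomputed Gram matrix.

```python
from sklearn.decomposition import KernelPCA
from sklearn.metrics.pairwise import laplacian_kernel

for kern in ("rbf", "poly", "sigmoid"):    # panels (a)-(c)
    scores = KernelPCA(n_components=2, kernel=kern).fit_transform(X_pre)

K = laplacian_kernel(X_pre)                # panel (d): Laplacian kernel
scores_lap = KernelPCA(n_components=2, kernel="precomputed").fit_transform(K)
```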
Figure 7. Distribution of LMWHA–A and LMWHA–E (a) in 3D space and (b) in 2D space after t-SNE dimensionality reduction.
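A minimal t-SNE sketch for the 3D and 2D embeddings of Figure 7 (the perplexity value is an assumption sized for the small placeholder dataset):

```python
from sklearn.manifold import TSNE

# Perplexity must stay below the number of samples
emb3 = TSNE(n_components=3, perplexity=5, random_state=0).fit_transform(X_pre)
emb2 = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(X_pre)
```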
Figure 8. Classification results of the (a) training dataset and (b) test dataset using the PLS-DA method.
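PLS-DA can be sketched as PLS regression on a 0/1 class-membership vector followed by thresholding; the label encoding, latent-variable count, and train/test split below are assumptions:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split

y = np.repeat([0, 1], 10)                  # placeholder: 0 = LMWHA-A, 1 = LMWHA-E
X_train, X_test, y_train, y_test = train_test_split(
    X_pre, y, test_size=0.2, stratify=y, random_state=0)

pls = PLSRegression(n_components=5)        # latent-variable count is an assumption
pls.fit(X_train, y_train)
y_pred = (pls.predict(X_test).ravel() >= 0.5).astype(int)
```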
Figure 9. Parameter selection results of (a) GS-SVC, (b) GA-SVC, and (c) PSO-SVC.
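GS-SVC can be sketched as an exhaustive cross-validated grid search; GA-SVC and PSO-SVC optimize the same cross-validation objective but explore the C–gamma plane with a genetic algorithm or particle swarm (typically via external libraries). The search ranges below are assumptions:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {"C": 2.0 ** np.arange(-5, 11), "gamma": 2.0 ** np.arange(-10, 1)}
gs = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="accuracy")
gs.fit(X_train, y_train)                   # split from the PLS-DA sketch
print(gs.best_params_, gs.best_score_)
```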
Figure 10. (a) OOB error rate and (b) classification accuracy with different numbers of decision trees (DTs).
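A sketch of the sweep behind Figure 10, tracking the out-of-bag (OOB) error and test accuracy as the forest grows (the range of tree counts is an assumption):

```python
from sklearn.ensemble import RandomForestClassifier

for n in range(10, 210, 10):
    rf = RandomForestClassifier(n_estimators=n, oob_score=True, random_state=0)
    rf.fit(X_train, y_train)
    print(n, 1 - rf.oob_score_, rf.score(X_test, y_test))  # OOB error, test accuracy
```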
Figure 11. (a) The loss curve and (b) the accuracy curve of the 1D-CNN-7 training process.
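A minimal Keras sketch of a 1D-CNN for binary spectral classification is given below; the framework choice, layer count, filter sizes, and training settings are assumptions and need not match the 1D-CNN-7 architecture reported here.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(700, 1)),              # one NIR spectrum per sample
    layers.Conv1D(16, 7, activation="relu"),
    layers.MaxPooling1D(2),
    layers.Conv1D(32, 5, activation="relu"),
    layers.MaxPooling1D(2),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),     # P(LMWHA-E)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
history = model.fit(X_train[..., None], y_train, epochs=100,
                    validation_data=(X_test[..., None], y_test))
```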
Figure 12. Classification results for the (a) training dataset and (b) test dataset of 1D-CNN-7.
Figure 13. (a) The loss curve and (b) the accuracy curve of the LSTM training process.
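An analogous Keras sketch for the LSTM model, reading each spectrum as a sequence of absorbance values (the hidden size and training settings are assumptions):

```python
from tensorflow.keras import layers, models

lstm = models.Sequential([
    layers.Input(shape=(700, 1)),
    layers.LSTM(64),                           # hidden size is an assumption
    layers.Dense(1, activation="sigmoid"),
])
lstm.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
history = lstm.fit(X_train[..., None], y_train, epochs=100,
                   validation_data=(X_test[..., None], y_test))
```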
Figure 14. Classification results for the (a) training dataset and (b) test dataset with the LSTM model.
Table 1. Classification metrics of the SVC, GS-SVC, GA-SVC, and PSO-SVC models.

Method                Accuracy (%)   Precision (%)   Specificity (%)   Sensitivity/Recall (%)   F1 Score   AUC
Training of SVC       90             92.5            92.1              88.1                     90.2       0.9706
Training of GS-SVC    100            100             100               100                      100        1
Training of GA-SVC    98.8           97.5            97.6              100                      98.7       0.9819
Training of PSO-SVC   93.8           95              94.9              92.7                     93.8       0.9750
Test of SVC           70             60              66.3              75                       66.7       0.6400
Test of GS-SVC        80             80              80                80                       80         0.8000
Test of GA-SVC        90             100             100               83.3                     90.9       0.9600
Test of PSO-SVC       80             60              71.4              100                      75         0.8000
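All metrics in Table 1 derive from the binary confusion matrix [84,85], with the AUC computed from continuous decision scores [86]. A sketch using the hypothetical y_test, y_pred, and fitted gs search from the earlier sketches:

```python
from sklearn.metrics import confusion_matrix, roc_auc_score

tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
accuracy    = (tp + tn) / (tp + tn + fp + fn)
precision   = tp / (tp + fp)
specificity = tn / (tn + fp)
sensitivity = tp / (tp + fn)                   # also called recall
f1 = 2 * precision * sensitivity / (precision + sensitivity)
# AUC needs continuous scores, e.g., the SVC decision function
auc = roc_auc_score(y_test, gs.decision_function(X_test))
```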