Next Article in Journal
Integrated Temperature–Humidity Sensors for a Pouch-Type Battery Using 100% Printing Process
Previous Article in Journal
A Novel Non-Coherent Q-Ary TH-PPM Transceiver
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Review

Application of Data Fusion in Traditional Chinese Medicine: A Review

by
Rui Huang
1,2,
Shuangcheng Ma
1,
Shengyun Dai
1,* and
Jian Zheng
1,*
1
National Institutes for Food and Drug Control, Beijing 102629, China
2
School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing 211198, China
*
Authors to whom correspondence should be addressed.
Sensors 2024, 24(1), 106; https://doi.org/10.3390/s24010106
Submission received: 1 December 2023 / Revised: 22 December 2023 / Accepted: 22 December 2023 / Published: 25 December 2023
(This article belongs to the Section Chemical Sensors)

Abstract

:
Traditional Chinese medicine is characterized by numerous chemical constituents, complex components, and unpredictable interactions among constituents. Therefore, a single analytical technique is usually unable to obtain comprehensive chemical information. Data fusion is an information processing technology that can improve the accuracy of test results by fusing data from multiple devices, which has a broad application prospect by utilizing chemometrics methods, adopting low-level, mid-level, and high-level data fusion techniques, and establishing final classification or prediction models. This paper summarizes the current status of the application of data fusion strategies based on spectroscopy, mass spectrometry, chromatography, and sensor technologies in traditional Chinese medicine (TCM) in light of the latest research progress of data fusion technology at home and abroad. It also gives an outlook on the development of data fusion technology in TCM analysis to provide references for the research and development of TCM.

1. Introduction

Traditional Chinese medicine (TCM) is medicine whose clinical application is guided by traditional Chinese medical theories. According to the former study, which showed that TCM contributed to the treatment of COVID-19 owing to its efficacy and comprehensive therapeutic theory, this resulted in its widespread use worldwide [1]. TCM has a long history of use and is mainly derived from natural medicines and their processed products, which have comprehensive therapeutic effects on a variety of diseases, reliable clinical efficacy, and fewer toxic side effects. However, TCM is a complex system, characterized by a large number of chemical compositions, unclear pharmacological mechanisms of action, and uneven quality [2]. Additionally, the traditional evaluation of TCM relies on the human senses of taste, vision, smell, and touch, which are highly subjective and unstable. With the modernization and internationalization of TCM, more and more modern techniques have been used to comprehensively evaluate the quality of TCM, to elucidate the group of chemical substances exerting medicinal effects, and to produce TCM preparations, such as spectroscopy, chromatography, mass spectrometry, and sensor technology. After a long period of exploration and practice, a relatively complete system of analysis of TCM has been formed. It is difficult for a single analytical technique to measure the component information to fully reflect the information of TCM and give a comprehensive evaluation, which can be effectively solved by fusing the information with data [3]. Multiple instrumental analytical data often provide complementary information, and data fusion strategies can combine these data with chemometric methods to obtain more comprehensive information and better classification and prediction results [4]. This paper summarizes the application of data fusion strategies in TCM by citing the literature published from 2010 to 2023 and providing an outlook on its future development.

2. Data Fusion Technology

2.1. Introduction to Data Fusion Technology

Data fusion techniques are methods of joining multiple data sources to obtain more accurate and comprehensive information. Its advantage is that the data can be separated from the original analysis techniques and other independent data to build a new model and use it for the calculation and speculation of the results [5], which can be applied qualitatively and quantitatively. As a technology with more accurate judgment and higher precision, data fusion is commonly used in automation [6], the mining industry [7], aerospace [8,9,10], military technology [11], agriculture [12], and medical science [13], etc. In the face of high-precision testing needs, data fusion always achieves good results. Ji et al. [14] used fluorescence lifetime imaging microscopy (FLIM) to image exfoliated cervical cells and analyzed the images through an unsupervised machine learning method to build a model for predicting the risk of cervical cancer. The sensitivity and specificity of the model were 91% and 100%, respectively, which were significantly higher than that of the traditional cytological methods, demonstrating that data fusion technology can be used as an effective means of cervical cancer screening and detection. In recent years, the application of data fusion technology has gradually become widespread. In addition to its application in scientific and technological disciplines, it has also gradually begun to be applied to the food industry. Obisesan et al. [15] used HPLC equipped with two detectors, ultraviolet (UV) and charged aerosol (CAD), to identify the origin of palm oil, fused the data measured by the two detectors, and the fusion was able to better validate the geographic origin of the palm oil compared to the results of modeling the classification established by a single technique, which proves the feasibility of the data fusion technology.

2.2. Data Fusion Strategy

Based on the data fusion structure the fusion strategies can be categorized into Low-level, Mid-level, and High-level data fusion [16], schemes of which are shown in Figure 1.
In low-level data fusion, also known as data-level fusion, all data are simply stitched together in a sample series into a single matrix whose rows are the number of samples and columns are the number of variables measured by different techniques [17]. Because of its characteristics of large sample size, computational inefficiency, and inability to perform direct fusion for data with different magnitudes, low-level fusion is subject to pre-processing and variable screening, which usually requires pre-processing of the data such as normalization, noise reduction, and deletion of redundant variables before fusion [18]. Low-level fusion is easier to operate than mid- and high-level fusion and can obtain comprehensive information. It is simple to operate, has high detection accuracy, and is suitable for samples with small amounts of data information, e.g., low-level data fusion can better recognize different types of fruit juices [19], identify the authenticity of olive oil as well as origin traceability [20,21].
Mid-level fusion, also known as feature-level fusion, first extracts feature information from each data source, and then analyzes and processes the feature information of each part in tandem, usually using principal component analysis (PCA) or partial least squares discriminant analysis (PLS-DA) [22]. The advantages are that it reduces the amount of data, eliminates irrelevant variables, and improves the computational efficiency, so in most cases, the fusion effect is better than the low-level data fusion [23], which is also the most used fusion method in data fusion strategies. Mid-level fusion feature information should be considered when dealing with data information of high complexity to reduce variables and shorten the calculation time. The mid-level fusion technique is also commonly used in the quality evaluation and origin traceability of food products [24,25], where spectroscopy and mass spectrometry data are fused for analysis [26].
High-level data fusion, also known as decision-level fusion, establishes classification and prediction models for data from different sources separately, and then integrates the results of multiple analyses to obtain the results of joint decision-making. The advantages are strong anti-interference and good fault tolerance; the disadvantages are high processing cost, cumbersome operation, and loss of a large amount of detailed information [25]. High-level data fusion usually includes low-level fusion and mid-level fusion of data, and the key to fusion is to choose the appropriate model. The commonly used models include the Bayesian model, voting mechanism [27], and fuzzy set theory [28]. Li et al. [29] used mid-infrared (MIR) and Raman spectroscopy to determine the content of high fructose syrup (HFGS) in perch honey mixtures and established a partial least squares (PLS) model to predict the concentration of adulterants and, at the same time, adopted three data fusion strategies, low, mid, and high, to improve the prediction accuracy of the quantitative model. The complementary advantages of MIR and Raman spectroscopy indicate that the high-level fusion strategy can be used as a reliable tool for quantitative analysis. The construction of high-level data fusion is more complicated, but the benefit lies in its separate treatment of data blocks, which reduces the mutual interference between different models. High-level fusion strategies can solve the problem of low accuracy of low-level and mid-level fusion and are suitable for analysis scenarios that require higher accuracy. They are commonly used to solve classification problems, such as the classification of hyperspectral images [30], the classification of wild and cultivated mushrooms [28], and the classification of wine varieties [31].
Data fusion techniques can combine information from multiple sources to obtain more comprehensive data through complementary information. The results of research in several fields have shown that models built using data fusion strategies embody more accurate detection and utility than models built with data from a single source [32,33,34,35].

2.3. Pre-Processing

Data pre-processing strategy is the process of preparing raw data and making it suitable for machine learning models. For better multivariate analysis, data from a single source needs to be preprocessed to properly scale the data, to reduce noise, and remove uninformative systematic variations. Common data pre-processing includes standard normal variate (SNV), multiplicative scatter correction (MSC), Savitzky-Golay (SG), Wavelet Denoising (WDS), baseline, normalization, and noise, etc. MSC eliminates scattering due to uneven particle distribution and size. The derivative algorithm eliminates interference from baseline drift or background, separates overlapping peaks, and improves resolution and sensitivity [36]. Smoothing can be used to reduce random errors [37]. Indeed, different data sources require special treatment according to characteristics. Spectrum pre-processing is an essential component in the near-infrared (NIR) calibration. SG [38] and multiplicative MSC are worth testing first, likewise, standard normal SNV was used for mid-infrared (MIR) data [39]. Meanwhile, baseline shifts in infrared spectroscopy can be eliminated with derivatives [40,41]. UV spectroscopy often uses baseline corrections [42], while mass spectrometry uses normalization [41]. In addition, due to the complex sample matrix, scaling of the fused matrix is often used as a data pre-processing method to construct suitable models [42].

3. Traditional Chinese Medicine and Data Analysis

3.1. Problems and Challenges Facing Traditional Chinese Medicine

TCM has a long history of being used for the prevention and treatment of diseases, and as early as thousands of years ago, there was a work, Shennong Ben Cao Jing (Classic of the Materia Medica of the Divine Husbandman), which systematically described the theories, formulas, and functions of TCM [43]. With people’s emphasis on health and the booming development of TCM-related industries, TCM has become a part of the global healthcare system [44], however, with the increasing demand for TCM, more and more problems have been exposed. TCM came from nature: the composition is complex, the mechanism of action is not clear, the safety and efficacy evaluation system is not yet perfect, and the evaluation data are not enough to be recognized by the modern regulatory agencies in Western countries [45]. In the production of TCM preparations, the quality cannot be controlled online, because changes in parameters in the real-time process are not detected in time, and the process parameters set in advance are only applied for the application.
Regardless of the system of medicine, consistent quality of drugs is a prerequisite for stable efficacy, so there is an urgent need for a means to accurately test and analyze the application of TCM in various fields.

3.2. Data Analysis Techniques and Traditional Chinese Medicine

With the rapid development of modern analytical and computer technologies, different analytical strategies and methods have also been widely adopted in TCM. Equally, some means of modern data analysis techniques have also been used in the field of TCM, such as chemometrics [46,47], fingerprinting [48,49], metabolomics [50], machine learning [46,51,52], and deep learning [53,54,55].
Chemometrics is an emerging interdisciplinary discipline based on computers and mathematics, which establishes a link between the measured value of a chemical system and the state of the system through statistical or mathematical methods, and has been widely used in the identification [56], origin traceability [57], qualitative characterization [58], quality evaluation [59] and other research in TCM, as well as the entire process of TCM production, including the design of the preparation process and optimization [60], online monitoring [61], and so on. Pattern recognition is an important part of chemometrics, including traditional recognition modes (HCA, PCA, PLS-DA, etc.) and emerging recognition modes (SVM, ANNs, etc.), usually used with fingerprinting, metabolomics, and data fusion.
The technique of chromatography fingerprinting of TCM has been proven to be a comprehensive strategy for evaluating the complete quality of TCM, which can comprehensively reflect the types and quantities of chemical components contained in TCM and its preparations, as well as describe and evaluate the quality of medicines as a whole [50], which is in line with the holistic theory of TCM. Han et al. [62], to better control the quality of Flos Puerariae (saffron), used chemical fingerprinting and chemometric methods for qualitative and quantitative analysis. First, High Performance Liquid Chromatography (HPLC) was used to obtain its fingerprint, followed by a screening of chemical markers by hierarchical cluster analysis (HCA), principal component analysis (PCA), and orthogonal partial least squares discriminant analysis (OPLS-DA). Subsequently, the chemical constituents in Flos Puerariae (saffron) were identified using HPLC with HPLC-FT-ICR-MS. Finally, HPLC was used to quantify the characterized components in Flos Puerariae. The results showed that six peaks in the fingerprint profile were considered as qualitative markers, and their contents were determined separately. In conclusion, fingerprint profiling combined with chemometric methods can discover chemical markers, and on this basis, if further spectroscopic studies are carried out, the quality of TCM can be combined with its therapeutic efficacy, which can help to elucidate the mechanism of action of TCM.
Metabolomics, along with genomics, transcriptomics, and proteomics, forms the basis of systems biology, and in conjunction with chemometrics techniques, can generate high-quality data and identify potential biomarkers [47]. Metabolomics aims to provide a comprehensive metabolomic characterization of system-wide metabolic changes induced by holistic environmental interventions, as well as a comprehensive metabolomic characterization of complex biological systems [63], which is consistent with the holistic view of TCM and therefore is widely used in assessing the quality of TCM, clinical efficacy, and the essence of TCM syndrome. Yi et al. [64] combined GC-MS with chemometrics to analyze the metabolomics of Chenpi and Qingpi. Firstly, a total of 24 samples of Chenpi at the first fruiting stage and fully ripened stage were analyzed for volatile constituents by gas chromatography-mass spectrometry (GC-MS), and the problem of overlapping some of the peaks was resolved using alternatively moving window factor analysis (AMWFA). Finally, the qualitative and quantitative results of 82 volatile components were also obtained. In addition, the metabolic footprints during fruit ripening were revealed with the help of PCA, thermogram analysis, and Pearson’s correlation analysis, and the characteristic compounds representing three different stages of ripening were screened out. Therefore, the combination of metabolomics analysis and chemometrics will be an effective strategy for the quality control of proximate TCM.
The deep learning method is one of the machine learning methods that can actively screen the data and extract high-dimensional features of the data. Additionally, the prediction and classification accuracy of the data is higher than traditional machine learning [61], Kabir et al. [47] investigated hyperspectral imaging combined with the CNN network to recognize the Fritillaria thunbergii varieties, and the performance of the two models, PLS-DA and SVM, was compared with CNN. The results showed that CNN had the highest accuracy on both training and test sets, 98.88% and 88.89%, respectively, followed by 92.59% and 81.94% for PLS-DA and 99.65% and 79.17% for SVM. This study demonstrated that hyperspectral imaging combined with deep learning can quickly, nondestructively, and accurately identify Fritillaria thunbergii varieties. However, multi-source information often contains a lot of redundant information, which can be feature extracted using a data fusion strategy; therefore, combining data fusion strategies from multi-source instruments and deep learning for traditional Chinese medicine is a new direction and technique for future research.
Data analysis technology has mined more useful information for TCM through analyzing TCM from various aspects and angles. Data fusion technology is also one of the data analyses that can provide more accurate sample information and better inference. Additionally, it has become gradually better known and widely used in the field of TCM in recent years due to its high accuracy and complementary information.

3.3. Model Validation

Data fusion is a powerful approach which provides more comprehensive information. However, as the sample information continues to increase, the model complexity and untrained parameters will continue to expand. At this point, the model is highly susceptible to overfitting, which is characterized by a model that has less loss and higher prediction accuracy on the training data, but the model’s loss on the test set will be large and less accurate. This happens in many studies, so proper validation techniques for these models are essential to avoid overfitting. Commonly used validation methods are internal cross-validation or external dataset testing [65].
What the cross-validation method does is try to do multiple different sets of training/testing of the model using different training/testing set divisions to deal with the problem of overly one-sided word test results and insufficient training data. The training set is the dataset used to train the parameters within the model, and the Classifier adjusts itself directly to the training set to get better classification results. The validation set is used to check the state of the model during the training process. After the performance of the validation set is stabilized, if the training is continued, the performance of the training set will continue to rise, but the validation set will fall instead of rising: overfitting generally occurs, so the validation set is also used to determine when to stop training. The external test set is used to evaluate the model generalization ability, and thus determine whether this model works or not. Around 77% of the research articles used external datasets to test the performance of the developed models, while the rest relied on cross-validation procedures. Testing the predictive ability of a model on external datasets is crucial for performance evaluation [66].
Since multivariate analysis is scale-dependent, data from single techniques are usually preprocessed to properly scale the data. It is also worth noting that special attention must be paid during validation to justify the appropriate selection of features such as the number of PCs to avoid the over-/under- fitting of the models [66]. If the model is too simple with few parameters, the deviation between the predicted values and the correct values we try to predict will be high, resulting in an underfitting phenomenon. On the contrary, if the model has a large number of parameters to respond to all the information from the data source, an overfitting phenomenon occurs that performs well on the training set but has a high error rate on the test set [67]. This remains a problem for all datasets: that is, the balance between data overfitting and underfitting. So, when using data fusion techniques, it is important to pay special attention to the problem of overfitting and underfitting of the model after fusion.

4. Application of Data Fusion Technology in Traditional Chinese Medicine

4.1. Data Fusion for Different Spectroscopy Data

Spectroscopy is one of the most widely used analytical instruments in TCM. In general, spectroscopy data is a matrix of M × N, in which M represents a row of samples and N represents a column of variables. The matrix of M × N is the input variable, while the output variable can be many, such as content, classification status, etc. There are common spectroscopic techniques such as ultraviolet-visible spectroscopy (UV-VIS), Raman Spectroscopy, near-infrared spectroscopy (NIR), and mid-infrared spectroscopy (MIR), etc. The main advantages of spectroscopy are simple pre-processing, simple analysis, low cost, and the ability to perform real-time control. Data fusion between different kinds of spectra is widely used in TCM. Based on the complementary nature of different spectral information, the fusion strategy can obtain more comprehensive data.

4.1.1. Data Fusion for Infrared Spectroscopy Data

The most common is the data fusion between MIR and NIR, which can quickly and accurately identify the sample information as well as determine the species and content of organic matter based on the sample’s characteristic peaks and specific absorption spectroscopy [68]. Fu et al. [69] used NIR and MIR combined with chemometrics to identify Angelica dahurica from different origins, established a supervised pattern based on the PLS-DA algorithm, and were able to accurately identify the geographic origin of Angelica dahurica. Pei et al. [70] use FT-MIR and NIR spectroscopy combined with low, mid, and high-level data fusion, based on software such as OMNIC 9.7.7, SIMCA-P+13.0, MATLAB R2017a, and other software to carry out data processing and modeling analysis, which successfully realized the origin traceability of 196 species of Wild Paris polyphylla var. yunnanensis through the establishment of PLS-DA and a random forest (RF) classification model, in which the accuracy of the model origin identification established by the high-level data fusion of the extraction of the principal component feature variables was 100%. This proves the feasibility of the 2 types of spectroscopy fusion for the accurate origin traceability of Wild Paris polyphylla var. yunnanensis, and the flow of data fusion is shown in Figure 2. To ensure the safety of the clinical medication, Sun et al. [71] used NIR and infrared spectroscopy (IR) to differentiate between official and unofficial rhubarb, and they compared four classification methods, namely PLS-DA, SIMCA, SVM, and ANN, and the results showed that fusion using the ANN model resulted in 100% classification accuracy. Qi et al. [72] assessed the quality of Rhizoma Coptidis from different origins based on quantitative and qualitative metabolic spectroscopy obtained by HPLC in combination with FT-NIR and FT-MIR. First, eight alkaloids were determined by HPLC to screen out the index compounds that could be used as indicator compounds for the quality identification of Huanglian, and then a single-technique model and a spectroscopy fusion model were established concerning the content data of the index compounds. The results showed that the fusion of spectroscopy data with the partial least squares regression (PLSR) algorithm could effectively determine the content of the index compounds and achieve better prediction results, and the fusion technique could provide a more comprehensive and effective quality assessment of Rhizoma Coptidis. Zhang et al. [73] studied the extraction process of Xiao’er Xiaoji Zhike Oral Liquid (XXZOL) by first determining the concentrations of seven key quality attributes during the extraction process, and then establishing low-level and mid-level fusion models based on two types of spectroscopy data in the NIR and MIR with PLS. The results of MATLAB 2019a software showed that the prediction of the data fusion model was better than that of the spectroscopy model for seven key quality attributes, which led to faster, more comprehensive, and more complete detection of information in the extraction process. Hai et al. [74] collected spectroscopy information of 320 Dendrobium huoshanense (DHS) using nano effects near and mid-infrared, extracted their eigenvariables, and performed mid-level data fusion. The resulting eigenvector projection (VIP) was combined with PLS-DA and orthogonal partial least squares discriminant analysis OPLS-DA to determine the age of the DHS with a discrimination accuracy of 100%. This study demonstrated that the method of multi-spectroscopy eigenvector fusion is a reliable analytical method to accurately identify the cultivation age of TCM. Wang et al. [75] employed a low, mid, and high-level fusion strategy using FT-NIR and attenuated total reflectance Fourier transform mid-infrared spectroscopy (ATR-FT-MIR, ATR-FT-MIR) combined with chemometrics for the TCM Eucommia ulmoides leaves from 13 provinces in China to geographic traceability. The results showed that the identification results of the high-level data fusion strategy were better than those of the low and mid-level fusion strategies, probably because the high-level fusion which occurred at the decision level and could weaken the influence of irrelevant or interfering information. Li et al. [76] identified the geographic sources of Panax notoginseng from five regions in Yunnan Province using low-level, mid and high-level fusion strategies by using FT-MIR and NIR junction RF algorithms, and then fused the results of the decisions using SVMs. The results showed that the mid and high-level data fusion strategies provided better identification results than the low-level fusion strategy, probably because the low-level fusion contained too much useless raw information. Similarly, Yang et al. [77] constructed a classification model for Panax notoginseng with different mixing ratios by using analysis software such as Unscrambler 10.4, OriginPro 9.0, and MATLAB 2018a to perform low, mid, and high-level data fusion using NIR and MIR of Panax notoginseng doped samples of different grades. The results showed that the application of the data fusion technique successfully improved the identification of adulterants in Panax notoginseng powder compared with the results of the single-spectrum analysis.

4.1.2. Data Fusion for Other Spectroscopy Data

Li et al. [78] geographically traced Atractylodes macrocephala Koidz to the low-level and mid-level fusion strategy based on chemometrics classification combined with fluorescence and Raman spectroscopy techniques and established principal component analysis-linear discriminant analysis (PCA-LDA), PLS-DA and RF classification. The results showed that the correct classification rate of PLS-DA based on low-level data fusion could reach 91.0%. Wang et al. [79] used laser-induced breakdown spectroscopy (LIBS) and IR combined with RF to identify Radix Astragali from different geographic regions based on low-level fusion, and unsupervised discriminative model using PCA to identify Radix Astragali, the results showed that three Radix Astragali samples could not be accurately differentiated by PCA, and the mid-level data fusion was based on the low-level fusion data, the feature variables were extracted by variable importance (VI) metrics, and the feature values were used as input variables to construct RF discriminant models. The results showed that the RF model of mid-level data fusion had the best prediction performance with an accuracy of 97.78%. Jiang et al. [80] used NIR and ultraviolet spectroscopy (UV) combined with PLSR to establish low-level data fusion and mid-level data fusion models for the simultaneous determination of six ginsenosides and four saccharides in Shenmai injections, and the results showed that the data fusion strategy under the synergistic effect of the complementary information of the two spectroscopy improved the efficiency of the quantitative analysis of ginseng saponins and saccharides in a better way compared to the modeling established by independent NIR or UV spectroscopy.
These studies have demonstrated that data fusion between spectroscopy can improve the accuracy of predictive models, and the method of fusing data from different spectroscopy has been very common in the field of TCM, especially the fusion between NIR and MIR data. IR spectroscopy can also be fused with inductively coupled plasma-atomic emission spectroscopy (ICP-AES) for the determination of trace and mineral elements in TCM [81]. It has been proved that integrating and combining spectroscopy and chemometrics is very feasible in the field of TCM.

4.2. Data Fusion for Spectroscopy Data and Other Techniques Data

Different types of origin traceability techniques have their own advantages, and after fusion, complementary information can be obtained from various aspects. Therefore, in addition to the fusion of different spectral data, spectra can also be fused with mass spectrometry and chromatography techniques. Thus, in addition to the mutual fusion of different spectral data, spectra can also be fused with mass spectrometry and chromatography. Table 1 lists some applications of such fusion in recent years.
Commonly, there is the fusion of chromatography and spectroscopy. Chromatography has a good separation function and can analyze the sample quantitatively, while spectroscopy can suggest the different chemical groups contained in the sample and can perform qualitative analysis. Combining the two can complement each other’s information and provide a comprehensive analysis of TCM with complex compositions. Yan et al. [82] used a combination of UV spectroscopy fingerprinting and HPLC fingerprinting combined with antioxidant activity determination and analyzed the results of the data based on the fingerprint super-informative feature digital evaluation system and SIMCA software to qualitatively and quantitatively evaluate the quality consistency of 26 batches of Liuwei Dihuang Pill, respectively. The two types of fingerprints reflected the quality of the drug from different perspectives, and the complementary information provided a rapid, effective, and accurate method for the evaluation of Chinese medicinal preparations. Yu et al. [83] used the fusion of spectroscopy and chromatography data to identify swertia leducii and its closely relatives, collected Fourier Transform Infrared Spectroscopy (FTIR) data and HPLC fingerprinting data of 102 samples and preprocessed them accordingly. Then low-level and mid-level fusion strategies were used to establish a RF discrimination model of spectroscopy and chromatography fingerprinting data, and the results showed that mid-level fusion of the data was most effective, and all the samples were correctly categorized, which demonstrated that the model constructed was able to discriminate the different genera of swertia leducii well. Wu et al. [84] used ultra-performance liquid chromatography (UPLC) to determine the content of components on a heavy floor, fused the results with the data collected by FTIR spectroscopy at low and mid-level levels, and simultaneously established PLS-DA, SVM, and RF Recognition mode models for the origin traceability of heavy floor plants of the Paris species. The results showed that the mid-level fusion model accuracy of PLS-DA was superior to that of the low data fusion model and the other models, which proved that the mid-level data fusion in conjunction with chemometrics methods could correctly identify different Paris species and could trace their geographic origins.
Mass spectrometry can determine the molecular mass and structural formula, which has the characteristics of rapidity, high sensitivity, and a wider range of use [85], the combination of mass spectrometry data with spectroscopy data can expand the scope of analysis, analyzing the complex composition of TCM compound formulas, and obtain the information of unknown TCM. Dai et al. [86] fumigated four geographically different sources of Ophiopogon japonicus Ker-Gawler (Liliaceae) and analyzed them metabolically by using NIR and UHPLC-LTQ-Orbitrap mass spectrometry. The effect of data fusion on the spatial distribution of the samples was investigated by fusing the NIR spectroscopy data with the information of the samples collected by UHPLC-HRMS. Firstly, based on the PCA, the discriminatory power of each technique was examined separately, and secondly, the NIR and UHPLC-HRMS datasets were compared for their classification results using mid-level fusion without variable selection and mid-level fusion with variable selection, respectively. The results showed that the model discriminatory power was significantly improved after data fusion. Meanwhile, Song et al. [87] used a data fusion technique to predict the antioxidant capacity and total phenolic content of bearberry leaves. Firstly, the spectroscopy fingerprints of bearberry leaves were obtained using UV-visible spectroscopy, and secondly, the metabolomics analyses were performed using ultra-high-performance liquid chromatography with time-of-flight mass spectrometry (UHPLC-Q-TOF-MS), and mid-level fusion of spectroscopy and mass spectrometry data was performed by using PLSR model. The results showed that the RPDs value and DPPH of this method were 6.258 and 6.699, respectively, which proved that the method had the excellent predictive ability.
Table 1. Applications of spectral and other data fusion.
Table 1. Applications of spectral and other data fusion.
TCMAnalytical TechniquesChemometricsFusion LevelRef.
Gentiana rigescensIR, UVPLS-DA, SVMmid[88]
Leccinum rugosicepsIR, UVPLS-DA, SVMmid[89]
Chinese herbal
injection
NIRS, UVSPLS, UVPLSlow, mid[80]
cultivated Macrohyporia cocosATR-FTIR,
UFLC
PLS-DA,
PLSR
-[90]
Wolfiporia
cocos
FT-NIR, FT-MIRPLS-DA, SVM
PCA, HCA
low, mid[91]
wildParis polyphylla Smith var. yunnanensisFT-MIR, UV-VisSVM-GS, RFlow, mid, high[92]
Polygonatum kingianumATR-FTIR, UV-VisPLS-DA,
PCA, HCA
Low, mid, high[93]
Dendrobium huoshanensenano-effect NIR
nano-effect MIR
PLS-DA,
OPLS-DA
mid[75]
Radix puerariaeNIR, UVPLSRlow[94]
Lonicera japonica and
Artemisia annua
NIR, MIRC-PLS, SO-PLS-[95]
Dendrobium SpeciesNIR, UVSVM, PLS-DAlow, mid, high[96]
Coptidis RhizomaNIR, MIRPLS-DA, PLSRlow, mid, high[97]
Radix AstragaliVis-NIR, NIRPCA, CNN,
SVM, LR
-[98]
Boletus bainiuganFT-NIR, FT-MIRPLS-DA, SVM
3DCOS-ResNet
low, mid[99]
HoneyGas sensors,
Liquid sensors
LDAlow[100]
Macrohyporia cocosFTIR, LCPLS-DAlow, high[101]
Macrohyporia cocosFTIR, HPLCPLS-DA, PCAmid[102]
Swertia leduciiFTIR, UPLCHCA, RFlow, mid[83]
sulfur-Ophiopogonis RadixNIR, UHPLC-HRMSPCA, PLS-LDAmid[86]
Curcumae
Rhizoma
FT-NIR, E-nose,
colormeter
PLS-DAmid[103]

4.3. Data Fusion for Mass Spectrometry and Chromatography

4.3.1. Data Fusion for Mass Spectrometry Data

The integration of mass spectrometry in TCM has fewer applications, probably because mass spectrometry is seldom used independently, and most of them are used in conjunction with chromatography, such as gas chromatography-mass spectrometry (GC-MS), liquid chromatography-mass spectrometry (LC-MS), and so on. Massaro et al. [104] used real-time high-resolution mass spectrometry (DART-HRMS) with two modes of positive and negative ions to collect data on genuine and counterfeit Oregano, and mid-level data fusion coupled with supervised chemometric modeling was used to achieve discrimination between genuine and counterfeit Oregano, with a sensitivity and specificity of more than 90%. Zhang et al. [105] used headspace gas chromatography-mass spectrometry (HSGC-MS) and ultra-high performance liquid chromatography-quadrupole time-of-flight mass spectrometry (UHPLC-QTOF/MS) techniques to analyze metabolites of five chrysanthemum flower near-relatives; HSGC-MS was used for volatile metabolites, and as a complementary approach, UHPLC-MS was used to efficiently characterize the nonvolatile, and the fusion of the data from the two mass spectrometry and spectroscopy provided comprehensive and complementary information for the chrysanthemum flower metabolomics study.

4.3.2. Data Fusion for Chromatography Data

Chromatography is the only common analytical instrument with a separation function, which can qualitatively and quantitatively analyze the separated components according to different chemical or physical properties. It has the advantages of high selectivity, high sensitivity, and high separation capacity, which is suitable for the identification of compounds, and when equipped with different detectors can obtain different analysis results [106]. Chromatography fingerprinting is also commonly used in TCM to obtain characteristic chromatograms of compounds, and the fusion of fingerprint data at different wavelengths allows for the creation of a new chromatogram as a way to retain more chemical information [107]. Shen et al. [108] traced the origin of the herbs according to the growing environment and latitude and longitude of Gentiana Rigescens, using HPLC combined with a diode array detector (DAD) to generate chromatography fingerprints of the aboveground and underground parts of Gentiana Rigescens, and used two supervised identification techniques, RF and OPLS-DA, to establish a classification model, which showed that the classification effect of OPLS-DA is better than that of RF. The accuracy of the classification model constructed by utilizing the low-level data fusion method is more than 95%, which has better recognition and prediction ability.
Currently, there are relatively few studies on the fusion of mass spectrometry and chromatography data in TCM, because when the intensity of the chromatography peaks is not of the same magnitude, different degrees of quality deviation will occur, and the contamination of the mass spectrometry instrument can lead to intensity drift. The change or aging of the chromatography column can lead to the drift of retention time, which makes the raw signal produced by the instrument not stable enough to maintain the model’s quality, so there is a need to develop some data pre-processing and other mathematical methods [109,110]. However, mass spectrometry can determine the molecular mass and structural formulae, and chromatography has high separation efficiency and fast analytical speed. Therefore, the fusion of the two is very effective and has been widely used in food [111,112] and other fields.

4.4. Data Fusion for Sensor Data

The traditional evaluation of TCM mainly relies on human taste, vision, smell, and touch, yet this reliance on experience is highly subjective and poorly reproducible. Sensor technology, as a newly developed detection device, obtains information quickly and accurately, and the fusion of data from different sensors can make up for the inadequacy of information from a single sensor [113]. It has been widely used in the fields of computers [114], agriculture [115], transportation [116], medicine [117], and so on. With the development of technology, in addition to the mutual fusion between sensor technologies, there is also the fusion of sensor technology with other technologies, which has been widely used in the identification of varieties, source identification, and quality evaluation for TCM [118,119]. Zhang et al. [120] used electronic nose (e-nose) and electronic tongue (e-tongue) data combined with chemometrics to qualitatively identify and quantify the quality of Xiaochaihu granules (XCHG). Firstly, the main chemical components such as saikosaponin b2, baicalin, and glycyrrhizin were quantified by UHPLC. Secondly, the data of odor, color, and taste of XCHG were measured by the e-nose, e-eye, and e-tongue. Finally, the sensor data were fused with the chromatography data, and the predictive model for the content constituents was established using PLSR. Wang et al. [121] used two sensor devices, an e-nose, and an e-tongue, to establish a discriminative model of Codonopsis Radix, selecting the optimal model from all combinations of models trained by the two pre-processing methods and the three classification methods, and the results showed that the PLS-DA could well discriminate the original plant source of Codonopsis Radix after modeling. Miao et al. [122] proposed a method for the fusion of e-nose and NIR data combined with SVM used for the identification of different species of ginsengs. Six commonly used features were extracted from the e-nose, and then the feature data of NIR were extracted by PCA. Finally, the models were constructed and trained using a support vector machine (SVM). The results showed that the high-level data fusion method combined with SVM achieved a good classification performance of 99.24%. Jing et al. [123] used an e-nose and e-tongue to measure the characteristic odors of two different species of Magnolia Officinalis Cortex, used UHPLC to quantify eight active components in Magnolia Officinalis Cortex, used Origin Pro2021b software to analyze the results with chemometrics, such as PCA, HCA, PLS-DA, LDA, etc., and used the low and mid-level data fusion to establish a classification model, respectively, in which the model discrimination accuracy of the mid-level fusion strategy reached 100%. The results demonstrated that this method is a powerful tool to differentiate between the two types of Magnolia Officinalis Cortex sources.

5. Comprehensive Evaluation of Data Fusion

In summary, data fusion technology has obvious advantages when facing the needs of multi-sample and high-precision detection, and relying on the complementarity as well as synergy of information, data fusion technology can better respond to the overall information of TCM, which is in line with the “holistic view” of TCM. However, there are some limitations in the choice of fusion strategy.
Data fusion is not the pursuit of high-level, high-level data fusion can meet the requirements of high-precision detection, but its operation is cumbersome, and it may not necessarily achieve better results than low-level or mid-level fusion [101]. Low-level data fusion is simple to operate, more rapid and convenient than mid-level and high-level data fusion techniques, and its advantage lies in the complementary nature of various types of data, which is suitable for samples with a small amount of information data and low detection requirements. However, it contains too much redundancy, interference, noise, and irrelevant information, and when the amount of information data is huge, the low-level data fusion consumes a lot of time, and it is difficult to meet the detection requirements. At this time, it is necessary to consider the application of the mid-level or high-level data fusion strategy. For data information with high complexity, mid-level data fusion should be used to find out the feature information in the data to reduce the variables and reduce the computation time, mid-level data fusion eliminates irrelevant variables and improves the computational efficiency, so this strategy is one of the most used methods in data fusion [124], but we should pay attention to the problem of the decrease in accuracy after mid-level fusion. The high-level fusion method has high accuracy and can solve the problem of accuracy degradation due to the reduction of variables in mid-level fusion, but the exploration of its decision layer consumes a lot of time and the model construction is more complicated, so it is only applicable to scenarios with a large sample size and the need for real-time high-precision analysis. The application of different fusion strategies does not have a unified standard, so in practice, data of different natures should be analyzed by multiple levels of fusion strategies, and then filter out the optimal data fusion method.
In terms of model selection, the commonly used models for classification are PLS-DA, SVM, RF, etc., and the commonly used regression models are PCR, PLSR, SVMR, etc., which can be used to select the appropriate model according to the specific research purpose. It is worth mentioning that the latest deep learning algorithms have far exceeded the traditional machine learning algorithms for data prediction and classification accuracy, deep learning does not require us to extract the features, but will automatically screen the data and extract the data high-dimensional features, so with the development of computer technology, data fusion technology combined with deep learning will become a new development trend in the field of TCM.

6. Conclusions and Future Outlooks

Obviously, data fusion technology, by the complementary and synergistic characteristics of information, has had an important impact on the authenticity, safety, and effectiveness of TCM, and has great potential for development in the field of TCM. With the improvement of data fusion technology and the development of the discipline of TCM, we should focus on the use of data fusion technology to break through the detection difficulties in TCM, combining the practical and theoretical, such as data fusion technology being used to simulate the growth environment, temperature, humidity, etc. To cultivate the best quality herbs, a suitable model of traceability and categorization of the origin of TCM is established to ensure the authenticity. Monitoring pesticide residues, heavy metals, and other harmful substances is required to ensure the safety and enhance the quality of TCM. In the production of TCM preparations, data fusion technology can be used to collect multi-point data for each production link such as extraction, concentration, and packaging of finished products to find the best process parameters and realize the precise control of the process. In terms of the effectiveness of TCM, data fusion technology can be used to evaluate the biological indexes, as well as quantitative and qualitative testing of the chemical indexes, to reach the safety and effectiveness of the TCM products.
The data fusion technology in the field of TCM is usually the fusion of physical information such as spectroscopy, mass spectroscopy, or sensors. In the future, chemical information or biological information like DNA and gene sequencing can be fused separately or with each other to obtain all the information about TCM in a more comprehensive and precise way. The current data analysis technology is limited to the fusion of two or three data. The fusion of four, five, or even more data can be tried in the future. The reliability of samples and data is also an issue that needs to be focused on, and the samples should have stronger representativeness and sufficient sample size to enhance the applicability and stability of the model and pave the way for subsequent data fusion. In the future, accurate processing of large and complex data sets will need to rely on other analytical techniques with reliability and specificity for adequate and systematic processing. Therefore, it is expected that data pre-processing techniques will have many innovations in a few years.
However, the data fusion technology itself carries a certain degree of error. If the samples are not correctly categorized, then the predictions will be more confusing. There are no clear limits on what kind of analytical content should be subject to data fusion techniques. For some data with small sample sizes and simple analysis tasks, data fusion undoubtedly adds to the difficulty of the workload. In practice, when data fusion techniques are employed, the preprocessing of information data or model selection is often operated and analyzed according to the thinking of the analysts or procedures set by the software, and there is a lack of standardization. So, in the future this should be combined with the existing research to put forward the application of data fusion strategy based on the standards in TCM.

Author Contributions

Conceptualization, writing: original draft preparation; R.H.; reference retrieval, funding acquisition, writing: review and editing. S.M.; writing: figure drawing, data curation, editing, and proofreading, S.D.; ideas article structure, review compilation, and final editing. J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Training Fund for academic leaders of NIFDC (2021X4).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No data were used for the research described in the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wang, W.-Y.; Xie, Y.; Zhou, H.; Liu, L. Contribution of Traditional Chinese Medicine to the Treatment of COVID-19. Phytomedicine 2021, 85, 153279. [Google Scholar] [CrossRef] [PubMed]
  2. Li, Y.; Xie, Y.; He, Y.; Hou, W.; Liao, M.; Liu, C. Quality Markers of Traditional Chinese Medicine: Concept, Progress, and Perspective. Engineering 2019, 5, 888–894. [Google Scholar] [CrossRef]
  3. Tahir, H.E.; Arslan, M.; Komla Mahunu, G.; Adam Mariod, A.; Hashim, S.; Xiaobo, Z.; Jiyong, S.; El-Seedi, H.R.; Musa, T.H. The Use of Analytical Techniques Coupled with Chemometrics for Tracing the Geographical Origin of Oils: A Systematic Review (2013–2020). Food Chem. 2022, 366, 130633. [Google Scholar] [CrossRef] [PubMed]
  4. Doeswijk, T.G.; Smilde, A.K.; Hageman, J.A.; Westerhuis, J.A.; Van Eeuwijk, F.A. On the Increase of Predictive Performance with High-Level Data Fusion. Anal. Chim. Acta 2011, 705, 41–47. [Google Scholar] [CrossRef] [PubMed]
  5. Ding, R.; Yu, L.; Wang, C.; Zhong, S.; Gu, R. Quality assessment of traditional Chinese medicine based on data fusion combined with machine learning: A review. Crit. Rev. Anal. Chem. 2023, 26, 1–18. [Google Scholar] [CrossRef] [PubMed]
  6. Brierley, N.; Tippetts, T.; Cawley, P. Data Fusion for Automated Non-Destructive Inspection. Proc. R. Soc. A 2014, 470, 20140167. [Google Scholar] [CrossRef]
  7. Lorenz, S.; Seidel, P.; Ghamisi, P.; Zimmermann, R.; Tusa, L.; Khodadadzadeh, M.; Contreras, I.C.; Gloaguen, R. Multi-Sensor Spectral Imaging of Geological Samples: A Data Fusion Approach Using Spatio-Spectral Feature Extraction. Sensors 2019, 19, 2787. [Google Scholar] [CrossRef]
  8. Chen, J.; Li, D.; Huang, R.; Chen, Z.; Li, W. Aero-Engine Remaining Useful Life Prediction Method with Self-Adaptive Multimodal Data Fusion and Cluster-Ensemble Transfer Regression. Reliab. Eng. Syst. Saf. 2023, 234, 109151. [Google Scholar] [CrossRef]
  9. Liu, H.; Gao, C.; Wang, X.; Dou, Z.; Zhang, W. Aeroelastic Prediction in Transonic Buffeting Flow with Data Fusion Method. Aerosp. Sci. Technol. 2023, 133, 108097. [Google Scholar] [CrossRef]
  10. Yan, H.; Zuo, H.; Sun, J.; Zhou, D.; Wang, H. Two-Stage Degradation Assessment and Prediction Method for Aircraft Engine Based on Data Fusion. Int. J. Aerosp. Eng. 2021, 2021, 1–16. [Google Scholar] [CrossRef]
  11. Chmielewski, M.; Kukiełka, M.; Pieczonka, P.; Gutowski, T. Methods and Analytical Tools for Assessing Tactical Situation in Military Operations Using Potential Approach and Sensor Data Fusion. Procedia Manuf. 2020, 44, 559–566. [Google Scholar] [CrossRef]
  12. Barrile, V.; Simonetti, S.; Citroni, R.; Fotia, A.; Bilotta, G. Experimenting Agriculture 4.0 with Sensors: A Data Fusion Approach between Remote Sensing, UAVs and Self-Driving Tractors. Sensors 2022, 22, 7910. [Google Scholar] [CrossRef] [PubMed]
  13. Steyaert, S.; Pizurica, M.; Nagaraj, D.; Khandelwal, P.; Hernandez-Boussard, T.; Gentles, A.J.; Gevaert, O. Multimodal Data Fusion for Cancer Biomarker Discovery with Deep Learning. Nat. Mach. Intell. 2023, 5, 351–362. [Google Scholar] [CrossRef] [PubMed]
  14. Ji, M.; Zhong, J.; Xue, R.; Su, W.; Kong, Y.; Fei, Y.; Ma, J.; Wang, Y.; Mi, L. Early Detection of Cervical Cancer by Fluorescence Lifetime Imaging Microscopy Combined with Unsupervised Machine Learning. IJMS 2022, 23, 11476. [Google Scholar] [CrossRef] [PubMed]
  15. Obisesan, K.A.; Jiménez-Carvelo, A.M.; Cuadros-Rodriguez, L.; Ruisánchez, I.; Callao, M.P. HPLC-UV and HPLC-CAD Chromatographic Data Fusion for the Authentication of the Geographical Origin of Palm Oil. Talanta 2017, 170, 413–418. [Google Scholar] [CrossRef] [PubMed]
  16. Azcarate, S.M.; Ríos-Reina, R.; Amigo, J.M.; Goicoechea, H.C. Data Handling in Data Fusion: Methodologies and Applications. TrAC Trends Anal. Chem. 2021, 143, 116355. [Google Scholar] [CrossRef]
  17. Firmani, P.; Vitale, R.; Ruckebusch, C.; Marini, F. ANOVA-Simultaneous Component Analysis Modelling of Low-Level-Fused Spectroscopic Data: A Food Chemistry Case-Study. Anal. Chim. Acta 2020, 1125, 308–314. [Google Scholar] [CrossRef]
  18. Mishra, P.; Biancolillo, A.; Roger, J.M.; Marini, F.; Rutledge, D.N. New Data Preprocessing Trends Based on Ensemble of Multiple Preprocessing Techniques. TrAC Trends Anal. Chem. 2020, 132, 116045. [Google Scholar] [CrossRef]
  19. Haddi, Z.; Mabrouk, S.; Bougrini, M.; Tahri, K.; Sghaier, K.; Barhoumi, H.; El Bari, N.; Maaref, A.; Jaffrezic-Renault, N.; Bouchikhi, B. E-Nose and e-Tongue Combination for Improved Recognition of Fruit Juice Samples. Food Chem. 2014, 150, 246–253. [Google Scholar] [CrossRef]
  20. Haddi, Z.; Boughrini, M.; Ihlou, S.; Amari, A.; Mabrouk, S.; Barhoumi, H.; Maaref, A.; El Bari, N.; Llobet, E.; Jaffrezic-Renault, N.; et al. Geographical Classification of Virgin Olive Oils by Combining the Electronic Nose and Tongue. In Proceedings of the 2012 IEEE Sensors Proceedings, Taipei, Taiwan, 28–31 October 2012; pp. 785–788. [Google Scholar]
  21. Haddi, Z.; Alami, H.; El Bari, N.; Tounsi, M.; Barhoumi, H.; Maaref, A.; Jaffrezic-Renault, N.; Bouchikhi, B. Electronic Nose and Tongue Combination for Improved Classification of Moroccan Virgin Olive Oil Profiles. Food Res. Int. 2013, 54, 1488–1498. [Google Scholar] [CrossRef]
  22. Biancolillo, A.; Bucci, R.; Magrì, A.L.; Magrì, A.D.; Marini, F. Data-Fusion for Multiplatform Characterization of an Italian Craft Beer Aimed at Its Authentication. Anal. Chim. Acta 2014, 820, 23–31. [Google Scholar] [CrossRef]
  23. Li, Y.; Wang, Y. Synergistic Strategy for the Geographical Traceability of Wild Boletus Tomentipes by Means of Data Fusion Analysis. Microchem. J. 2018, 140, 38–46. [Google Scholar] [CrossRef]
  24. Hong, Y.; Birse, N.; Quinn, B.; Li, Y.; Jia, W.; McCarron, P.; Wu, D.; Da Silva, G.R.; Vanhaecke, L.; Van Ruth, S.; et al. Data Fusion and Multivariate Analysis for Food Authenticity Analysis. Nat. Commun. 2023, 14, 3309. [Google Scholar] [CrossRef] [PubMed]
  25. Huang, L.; Zhao, J.; Chen, Q.; Zhang, Y. Nondestructive Measurement of Total Volatile Basic Nitrogen (TVB-N) in Pork Meat by Integrating near Infrared Spectroscopy, Computer Vision and Electronic Nose Techniques. Food Chem. 2014, 145, 228–236. [Google Scholar] [CrossRef] [PubMed]
  26. Lösel, H.; Brockelt, J.; Gärber, F.; Teipel, J.; Kuballa, T.; Seifert, S.; Fischer, M. Comparative Analysis of LC-ESI-IM-qToF-MS and FT-NIR Spectroscopy Approaches for the Authentication of Organic and Conventional Eggs. Metabolites 2023, 13, 882. [Google Scholar] [CrossRef] [PubMed]
  27. Wang, H.-P.; Chen, P.; Dai, J.-W.; Liu, D.; Li, J.-Y.; Xu, Y.-P.; Chu, X.-L. Recent Advances of Chemometric Calibration Methods in Modern Spectroscopy: Algorithms, Strategy, and Related Issues. TrAC Trends Anal. Chem. 2022, 153, 116648. [Google Scholar] [CrossRef]
  28. Yao, S.; Li, J.; Duan, Z.; Li, T.; Wang, Y. Fusion of Ultraviolet and Infrared Spectra Using Support Vector Machine and Random Forest Models for the Discrimination of Wild and Cultivated Mushrooms. Anal. Lett. 2020, 53, 1019–1033. [Google Scholar] [CrossRef]
  29. Li, Y.; Huang, Y.; Xia, J.; Xiong, Y.; Min, S. Quantitative Analysis of Honey Adulteration by Spectrum Analysis Combined with Several High-Level Data Fusion Strategies. Vib. Spectrosc. 2020, 108, 103060. [Google Scholar] [CrossRef]
  30. Liu, J.; Tang, H.; Zheng, G.C.; Li, J.; You, J. GAT and 3D-CNN Based Multi-Feature Fusion High-Spectrum Image Classification Method, Involves Performing Linear Feature Fusion to Advanced Feature Image and Super-Pixel-Level Feature Image, and Outputting Classification Result of Image Data to Be Tested. CN115375941-A.
  31. Ríos-Reina, R.; Azcarate, S.M.; Camiña, J.M.; Goicoechea, H.C. Multi-Level Data Fusion Strategies for Modeling Three-Way Electrophoresis Capillary and Fluorescence Arrays Enhancing Geographical and Grape Variety Classification of Wines. Anal. Chim. Acta 2020, 1126, 52–62. [Google Scholar] [CrossRef]
  32. Xian, G. Multiple Microphone Speaker Recognition System for Second Language Based on Biomimetic Pattern Recognition with Big Data Fusion. IOP Conf. Ser. Mater. Sci. Eng. 2020, 790, 012144. [Google Scholar] [CrossRef]
  33. Fang, J.; Hou, H.; Bi, Z.M.; Jin, D.; Han, L.; Yang, J.; Dai, S. Data Fusion in Forecasting Medical Demands Based on Spectrum of Post-Earthquake Diseases. J. Ind. Inf. Integr. 2021, 24, 100235. [Google Scholar] [CrossRef]
  34. Ihnaini, B.; Khan, M.A.; Khan, T.A.; Abbas, S.; Daoud, M.S.; Ahmad, M.; Khan, M.A. A Smart Healthcare Recommendation System for Multidisciplinary Diabetes Patients with Data Fusion Based on Deep Ensemble Learning. Comput. Intell. Neurosci. 2021, 2021, 1–11. [Google Scholar] [CrossRef] [PubMed]
  35. Zhao, L.; Wang, R.; Wang, J.; Yu, T.; Su, A. Nonlinear State Estimation with Delayed Measurements Using Data Fusion Technique and Cubature Kalman Filter for Chemical Processes. Chem. Eng. Res. Des. 2019, 141, 502–515. [Google Scholar] [CrossRef]
  36. Barnes, R.J.; Dhanoa, M.S.; Lister, S.J. Standard Normal Variate Transformation and De-Trending of Near-Infrared Diffuse Reflectance Spectra. Appl. Spectrosc. 1989, 43, 772–777. [Google Scholar] [CrossRef]
  37. Steinier, J.; Termonia, Y.; Deltour, J. Smoothing and Differentiation of Data by Simplified Least Square Procedure. Anal. Chem. 1972, 44, 1906–1909. [Google Scholar] [CrossRef] [PubMed]
  38. Jiao, Y.; Li, Z.; Chen, X.; Fei, S. Preprocessing Methods for Near-infrared Spectrum Calibration. J. Chemom. 2020, 34, e3306. [Google Scholar] [CrossRef]
  39. Brás, L.P.; Bernardino, S.A.; Lopes, J.A.; Menezes, J.C. Multiblock PLS as an Approach to Compare and Combine NIR and MIR Spectra in Calibrations of Soybean Flour. Chemom. Intell. Lab. Syst. 2005, 75, 91–99. [Google Scholar] [CrossRef]
  40. Pizarro, C.; Rodríguez-Tecedor, S.; Pérez-del-Notario, N.; Esteban-Díez, I.; González-Sáiz, J.M. Classification of Spanish Extra Virgin Olive Oils by Data Fusion of Visible Spectroscopic Fingerprints and Chemical Descriptors. Food Chem. 2013, 138, 915–922. [Google Scholar] [CrossRef]
  41. Casale, M.; Oliveri, P.; Casolino, C.; Sinelli, N.; Zunin, P.; Armanino, C.; Forina, M.; Lanteri, S. Characterisation of PDO Olive Oil Chianti Classico by Non-Selective (UV–Visible, NIR and MIR Spectroscopy) and Selective (Fatty Acid Composition) Analytical Techniques. Anal. Chim. Acta 2012, 712, 56–63. [Google Scholar] [CrossRef]
  42. Vera, L.; Aceña, L.; Guasch, J.; Boqué, R.; Mestres, M.; Busto, O. Discrimination and Sensory Description of Beers through Data Fusion. Talanta 2011, 87, 136–142. [Google Scholar] [CrossRef] [PubMed]
  43. Kong, D.-X.; Li, X.-J.; Zhang, H.-Y. Where Is the Hope for Drug Discovery? Let History Tell the Future. Drug Discov. Today 2009, 14, 115–119. [Google Scholar] [CrossRef] [PubMed]
  44. Li, Y.; Shen, Y.; Yao, C.; Guo, D. Quality Assessment of Herbal Medicines Based on Chemical Fingerprints Combined with Chemometrics Approach: A Review. J. Pharm. Biomed. Anal. 2020, 185, 113215. [Google Scholar] [CrossRef] [PubMed]
  45. Bansal, A.; Chhabra, V.; Rawal, R.K.; Sharma, S. Chemometrics: A New Scenario in Herbal Drug Standardization. J. Pharm. Anal. 2014, 4, 223–233. [Google Scholar] [CrossRef] [PubMed]
  46. Li, S.; Jiao, C.; Xu, Z.; Wu, Y.; Forsberg, E.; Peng, X.; He, S. Determination of Geographic Origins and Types of Lindera Aggregata Samples Using a Portable Short-Wave Infrared Hyperspectral Imager. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2022, 279, 121370. [Google Scholar] [CrossRef] [PubMed]
  47. Liu, S.; Liang, Y.-Z.; Liu, H. Chemometrics Applied to Quality Control and Metabolomics for Traditional Chinese Medicines. J. Chromatogr. B 2016, 1015–1016, 82–91. [Google Scholar] [CrossRef]
  48. Yu, M.; Xu, G.; Qin, M.; Li, Y.; Guo, Y.; Ma, Q. Multiple Fingerprints and Spectrum-Effect Relationship of Polysaccharides from Saposhnikoviae Radix. Molecules 2022, 27, 5278. [Google Scholar] [CrossRef]
  49. Li, X.; Gong, D.; Sun, G.; Zhang, H.; Sun, W. High Performance Liquid Chromatography Three-Wavelength Fusion Fingerprint Combined with Electrochemical Fingerprint and Antioxidant Method to Evaluate the Quality Consistency of Mingmu Dihuang Pill. J. Chromatogr. A 2022, 1681, 463448. [Google Scholar] [CrossRef]
  50. Liang, Y.-Z.; Xie, P.-S.; Chan, K. Chromatographic Fingerprinting and Metabolomics for Quality Control of TCM. Comb Chem. High Throughput Screen. 2010, 13, 943–953. [Google Scholar] [CrossRef]
  51. Kabir, M.H.; Guindo, M.L.; Chen, R.; Liu, F.; Luo, X.; Kong, W. Deep Learning Combined with Hyperspectral Imaging Technology for Variety Discrimination of Fritillaria Thunbergii. Molecules 2022, 27, 6042. [Google Scholar] [CrossRef]
  52. Jiao, C.; Xu, Z.; Bian, Q.; Forsberg, E.; Tan, Q.; Peng, X.; He, S. Machine Learning Classification of Origins and Varieties of Tetrastigma Hemsleyanum Using a Dual-Mode Microscopic Hyperspectral Imager. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2021, 261, 120054. [Google Scholar] [CrossRef] [PubMed]
  53. Ai, K.; Yuan, D.; Zheng, J. Experimental Research on the Antitumor Effect of Human Gastric Cancer Cells Transplanted in Nude Mice Based on Deep Learning Combined with Spleen-Invigorating Chinese Medicine. Comput. Math. Methods Med. 2022, 2022, 1–11. [Google Scholar] [CrossRef] [PubMed]
  54. Cheng, N.; Chen, Y.; Gao, W.; Liu, J.; Huang, Q.; Yan, C.; Huang, X.; Ding, C. An Improved Deep Learning Model: S-TextBLCNN for Traditional Chinese Medicine Formula Classification. Front. Genet. 2021, 12, 807825. [Google Scholar] [CrossRef] [PubMed]
  55. Wang, Z.; Li, L.; Song, M.; Yan, J.; Shi, J.; Yao, Y. Evaluating the Traditional Chinese Medicine (TCM) Officially Recommended in China for COVID-19 Using Ontology-Based Side-Effect Prediction Framework (OSPF) and Deep Learning. J. Ethnopharmacol. 2021, 272, 113957. [Google Scholar] [CrossRef] [PubMed]
  56. Tang, J.-F.; Li, W.-X.; Zhang, F.; Li, Y.-H.; Cao, Y.-J.; Zhao, Y.; Li, X.-L.; Ma, Z.-J. Discrimination of Radix Polygoni Multiflori from Different Geographical Areas by UPLC-QTOF/MS Combined with Chemometrics. Chin. Med. 2017, 12, 34. [Google Scholar] [CrossRef] [PubMed]
  57. Fang, H.; Chen, Y.; Wu, H.-L.; Chen, Y.; Wang, T.; Yang, J.; Fu, H.-Y.; Yang, X.-L.; Li, X.-F.; Yu, R.-Q. Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry Combined with Chemometrics to Identify the Origin of Chinese Medicinal Materials. RSC Adv. 2022, 12, 16886–16892. [Google Scholar] [CrossRef]
  58. Tao, L.; Liu, B.; Jin, Y.; Sun, D.; Liu, X.; Chen, Y.; Wu, Y. Characterization of Toad Skin for Traditional Chinese Medicine by Near-Infrared Spectroscopy and Chemometrics. Anal. Lett. 2017, 50, 1292–1306. [Google Scholar] [CrossRef]
  59. Wang, N.; Li, Z.-Y.; Zheng, X.-L.; Li, Q.; Yang, X.; Xu, H. Quality Assessment of Kumu Injection, a Traditional Chinese Medicine Preparation, Using HPLC Combined with Chemometric Methods and Qualitative and Quantitative Analysis of Multiple Alkaloids by Single Marker. Molecules 2018, 23, 856. [Google Scholar] [CrossRef]
  60. Kellogg, J.J.; Wallace, E.D.; Graf, T.N.; Oberlies, N.H.; Cech, N.B. Conventional and Accelerated-Solvent Extractions of Green Tea (Camellia Sinensis) for Metabolomics-Based Chemometrics. J. Pharm. Biomed. Anal. 2017, 145, 604–610. [Google Scholar] [CrossRef]
  61. Liu, R.; Sun, Q.; Hu, T.; Li, L.; Nie, L.; Wang, J.; Zhou, W.; Zang, H. Multi-Parameters Monitoring during Traditional Chinese Medicine Concentration Process with near Infrared Spectroscopy and Chemometrics. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2018, 192, 75–81. [Google Scholar] [CrossRef]
  62. Han, J.; Xu, K.; Yan, Q.; Sui, W.; Zhang, H.; Wang, S.; Zhang, Z.; Wei, Z.; Han, F. Qualitative and Quantitative Evaluation of Flos Puerariae by Using Chemical Fingerprint in Combination with Chemometrics Method. J. Pharm. Anal. 2022, 12, 489–499. [Google Scholar] [CrossRef] [PubMed]
  63. Weckwerth, W. Metabolomics: An Integral Technique in Systems Biology. Bioanalysis 2010, 2, 829–836. [Google Scholar] [CrossRef] [PubMed]
  64. Yi, L.; Dong, N.; Liu, S.; Yi, Z.; Zhang, Y. Chemical Features of Pericarpium Citri Reticulatae and Pericarpium Citri Reticulatae Viride Revealed by GC–MS Metabolomics Analysis. Food Chem. 2015, 186, 192–199. [Google Scholar] [CrossRef] [PubMed]
  65. Charilaou, P.; Battat, R. Machine Learning Models and Over-Fitting Considerations. World J Gastroenterol. 2022, 28, 605–607. [Google Scholar] [CrossRef] [PubMed]
  66. Casian, T.; Nagy, B.; Kovács, B.; Galata, D.L.; Hirsch, E.; Farkas, A. Challenges and Opportunities of Implementing Data Fusion in Process Analytical Technology—A Review. Molecules 2022, 27, 4846. [Google Scholar] [CrossRef] [PubMed]
  67. Belkin, M.; Hsu, D.; Ma, S.; Mandal, S. Reconciling Modern Machine-Learning Practice and the Classical Bias–Variance Trade-Off. Proc. Natl. Acad. Sci. USA 2019, 116, 15849–15854. [Google Scholar] [CrossRef] [PubMed]
  68. Li, F.; Xu, L.; You, T.; Lu, A. Measurement of Potentially Toxic Elements in the Soil through NIR, MIR, and XRF Spectral Data Fusion. Comput. Electron. Agric. 2021, 187, 106257. [Google Scholar] [CrossRef]
  69. Fu, H.; Shi, Q.; Wei, L.; Xu, L.; Guo, X.; Hu, O.; Lan, W.; Xie, S.; Yang, T. Rapid Recognition of Geoherbalism and Authenticity of a Chinese Herb by Data Fusion of Near-Infrared Spectroscopy (NIR) and Mid-Infrared (MIR) Spectroscopy Combined with Chemometrics. J. Spectrosc. 2019, 2019, 1–9. [Google Scholar] [CrossRef]
  70. Pei, Y.; Zuo, Z.; Zhang, Q.; Wang, Y. Data Fusion of Fourier Transform Mid-Infrared (MIR) and Near-Infrared (NIR) Spectroscopies to Identify Geographical Origin of Wild Paris Polyphylla Var. Yunnanensis. Molecules 2019, 24, 2559. [Google Scholar] [CrossRef]
  71. Sun, W.; Zhang, X.; Zhang, Z.; Zhu, R. Data Fusion of Near-Infrared and Mid-Infrared Spectra for Identification of Rhubarb. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2017, 171, 72–79. [Google Scholar] [CrossRef]
  72. Qi, L.; Ma, Y.; Zhong, F.; Shen, C. Comprehensive Quality Assessment for Rhizoma Coptidis Based on Quantitative and Qualitative Metabolic Profiles Using High Performance Liquid Chromatography, Fourier Transform near-Infrared and Fourier Transform Mid-Infrared Combined with Multivariate Statistical Analysis. J. Pharm. Biomed. Anal. 2018, 161, 436–443. [Google Scholar] [CrossRef] [PubMed]
  73. Zhang, J.; Xu, X.; Li, L.; Li, H.; Gao, L.; Yuan, X.; Du, H.; Guan, Y.; Zang, H. Multi Critical Quality Attributes Monitoring of Chinese Oral Liquid Extraction Process with a Spectral Sensor Fusion Strategy. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2022, 278, 121317. [Google Scholar] [CrossRef] [PubMed]
  74. Hai, C.; Long, W.; Suo, Y.; Lu, H.; Chen, H.; Yang, X.; Yang, J.; Fu, H. Nano-Effect Multivariate Fusion Spectroscopy Combined with Chemometrics for Accurate Identification the Cultivation Methods and Growth Years of Dendrobium Huoshanense. Microchem. J. 2022, 179, 107556. [Google Scholar] [CrossRef]
  75. Wang, C.-Y.; Tang, L.; Li, L.; Zhou, Q.; Li, Y.-J.; Li, J.; Wang, Y.-Z. Geographic Authentication of Eucommia Ulmoides Leaves Using Multivariate Analysis and Preliminary Study on the Compositional Response to Environment. Front. Plant Sci. 2020, 11, 79. [Google Scholar] [CrossRef] [PubMed]
  76. Li, Y.; Zhang, J.-Y.; Wang, Y.-Z. FT-MIR and NIR Spectral Data Fusion: A Synergetic Strategy for the Geographical Traceability of Panax Notoginseng. Anal. Bioanal. Chem. 2018, 410, 91–103. [Google Scholar] [CrossRef] [PubMed]
  77. Yang, X.; Song, J.; Peng, L.; Gao, L.; Liu, X.; Xie, L.; Li, G. Improving Identification Ability of Adulterants in Powdered Panax Notoginseng Using Particle Swarm Optimization and Data Fusion. Infrared Phys. Technol. 2019, 103, 103101. [Google Scholar] [CrossRef]
  78. Li, M.-X.; Fang, H.; Chen, Y.; Wang, T.; Yang, J.; Fu, H.-Y.; Yang, X.-L.; Li, X.-F.; Chen, Z.-P.; Yu, R.-Q. Data Fusion of Synchronous Fluorescence and Surface Enhanced Raman Scattering Spectroscopies for Geographical Origin Traceability of Atractylodes Macrocephala Koidz. Spectrosc. Lett. 2022, 55, 290–301. [Google Scholar] [CrossRef]
  79. Wang, Y.; Li, M.-G.; Feng, T.; Zhang, T.-L.; Feng, Y.-Q.; Li, H. Discrimination of Radix Astragali According to Geographical Regions by Data Fusion of Laser Induced Breakdown Spectroscopy (LIBS) and Infrared Spectroscopy (IR) Combined with Random Forest (RF). Chin. J. Anal. Chem. 2022, 50, 100057. [Google Scholar] [CrossRef]
  80. Jiang, C.; Liu, Y.; Qu, H. Data Fusion Strategy Based on near Infrared Spectra and Ultraviolet Spectra for Simultaneous Determination of Ginsenosides and Saccharides in Chinese Herbal Injection. Anal. Methods 2013, 5, 4467. [Google Scholar] [CrossRef]
  81. Zhang, J.S.; Wang, Y.M.; Song, X.H.; Sun, L.X.; Sun, Y.Z. A Comparison of Mineral Element Contents Among Different Types of Ginseng Using Inductively Coupled Plasma-Atomic Emission Spectroscopy. Microchem. J. 1994, 50, 131–135. [Google Scholar] [CrossRef]
  82. Yan, B.; Sun, G. Monitoring Quality Consistency of Liuwei Dihuang Pill by Integrating the Ultraviolet Spectroscopic Fingerprint, a Multi-Wavelength Fusion Fingerprint Method, and Antioxidant Activities. J. Sep. Sci. 2018, 41, 1182–1191. [Google Scholar] [CrossRef] [PubMed]
  83. Ye-xia, Y.; Li, L.; Yuan-zhong, W. Study on Differentiation of Swertia Leducii and Its Closely Relative Species Based on Data Fusion of Spectra and Chromatography. Spectrosc. Spectr. Anal. 2020, 40, 2440–2446. [Google Scholar] [CrossRef]
  84. Wu, X.-M.; Zuo, Z.-T.; Zhang, Q.-Z.; Wang, Y.-Z. Classification of Paris Species According to Botanical and Geographical Origins Based on Spectroscopic, Chromatographic, Conventional Chemometric Analysis and Data Fusion Strategy. Microchem. J. 2018, 143, 367–378. [Google Scholar] [CrossRef]
  85. Wingfield, J.; Wilson, I.D. Advances in Mass Spectrometry Within Drug Discovery. SLAS Discov. 2016, 21, 109–110. [Google Scholar] [CrossRef] [PubMed]
  86. Dai, S.; Lin, Z.; Xu, B.; Wang, Y.; Shi, X.; Qiao, Y.; Zhang, J. Metabolomics Data Fusion between near Infrared Spectroscopy and High-Resolution Mass Spectrometry: A Synergetic Approach to Boost Performance or Induce Confusion. Talanta 2018, 189, 641–648. [Google Scholar] [CrossRef] [PubMed]
  87. Song, X.-C.; Canellas, E.; Asensio, E.; Nerín, C. Predicting the Antioxidant Capacity and Total Phenolic Content of Bearberry Leaves by Data Fusion of UV–Vis Spectroscopy and UHPLC/Q-TOF-MS. Talanta 2020, 213, 120831. [Google Scholar] [CrossRef] [PubMed]
  88. Qi, L.; Zhang, J.; Zhao, Y.; Zuo, Z.; Wang, Y.-Z.; Jin, H. Characterization of Gentiana Rigescen s by Ultraviolet–Visible and Infrared Spectroscopies with Chemometrics. Anal. Lett. 2017, 50, 1497–1511. [Google Scholar] [CrossRef]
  89. Yao, S.; Li, T.; Liu, H.-G.; Li, J.-Q.; Wang, Y.-Z. Geographic Characterization of Leccinum Rugosiceps by Ultraviolet and Infrared Spectral Fusion. Anal. Lett. 2017, 50, 2257–2269. [Google Scholar] [CrossRef]
  90. Wang, Q.; Zuo, Z.; Huang, H.; Wang, Y. Comparison and Quantitative Analysis of Wild and Cultivated Macrohyporia Cocos Using Attenuated Total Refection-Fourier Transform Infrared Spectroscopy Combined with Ultra-Fast Liquid Chromatography. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2020, 226, 117633. [Google Scholar] [CrossRef]
  91. Li, L.; Zhang, S.; Zuo, Z.; Wang, Y. Data Fusion of Multiple-information Strategy Based on Fourier Transform near Infrared Spectroscopy and Fourier-transform Mid Infrared for Geographical Traceability of Wolfiporia Cocos Combined with Chemometrics. J. Chemom. 2022, 36, 9. [Google Scholar] [CrossRef]
  92. Wu, X.-M.; Zhang, Q.-Z.; Wang, Y.-Z. Traceability of Wild Paris Polyphylla Smith Var. Yunnanensis Based on Data Fusion Strategy of FT-MIR and UV–Vis Combined with SVM and Random Forest. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2018, 205, 479–488. [Google Scholar] [CrossRef] [PubMed]
  93. Zhang, J.; Wang, Y.Z.; Yang, M.Q.; Yang, W.Z.; Yang, S.B.; Zhang, J.Y. Identification and Evaluation of Polygonatum Kingianum with Different Growth Ages Based on Data Fusion Strategy. Microchem. J. 2021, 160, 105662. [Google Scholar] [CrossRef]
  94. Wang, Y.; Yang, Y.; Sun, H.; Dai, J.; Zhao, M.; Teng, C.; Ke, Z.; Yang, M.; Zhong, L.; Zhu, W. Application of a Data Fusion Strategy Combined with Multivariate Statistical Analysis for Quantification of Puerarin in Radix Puerariae. Vib. Spectrosc. 2020, 108, 103057. [Google Scholar] [CrossRef]
  95. Tao, L.; Via, B.; Wu, Y.; Xiao, W.; Liu, X. NIR and MIR Spectral Data Fusion for Rapid Detection of Lonicera Japonica and Artemisia Annua by Liquid Extraction Process. Vib. Spectrosc. 2019, 102, 31–38. [Google Scholar] [CrossRef]
  96. Wang, Y.; Huang, H.-Y.; Wang, Y.-Z. Authentication of Dendrobium Officinale from Similar Species with Infrared and Ultraviolet-Visible Spectroscopies with Data Visualization and Mining. Anal. Lett. 2020, 53, 1774–1793. [Google Scholar] [CrossRef]
  97. Tian, M.; Ma, X.; Liang, M.; Zang, H. Application of Rapid Identification and Determination of Moisture Content of Coptidis Rhizoma from Different Species Based on Data Fusion. J. AOAC Int. 2023, 106, 1389–1401. [Google Scholar] [CrossRef] [PubMed]
  98. Xiao, Q.; Bai, X.; Gao, P.; He, Y. Application of Convolutional Neural Network-Based Feature Extraction and Data Fusion for Geographical Origin Identification of Radix Astragali by Visible/Short-Wave Near-Infrared and Near Infrared Hyperspectral Imaging. Sensors 2020, 20, 4940. [Google Scholar] [CrossRef]
  99. Liu, S.; Liu, H.; Li, J.; Wang, Y. Solving the Identification Problems of Bolete Origins Based on Multiple Data Processing: Take Boletus Bainiugan as an Example. J. Food Compos. Anal. 2023, 124, 105693. [Google Scholar] [CrossRef]
  100. Masnan, M.J.; Mahat, N.I.; Zakaria, A.; Shakaff, A.Y.M.; Adom, A.H.; Sa’ad, F.S.A. Enhancing Classification Performance of Multisensory Data through Extraction and Selection of Features. Procedia Chem. 2012, 6, 132–140. [Google Scholar] [CrossRef]
  101. Wang, Q.-Q.; Huang, H.-Y.; Wang, Y.-Z. Geographical Authentication of Macrohyporia Cocos by a Data Fusion Method Combining Ultra-Fast Liquid Chromatography and Fourier Transform Infrared Spectroscopy. Molecules 2019, 24, 1320. [Google Scholar] [CrossRef]
  102. Wang, L.; Wang, Q.; Wang, Y.; Wang, Y. Comparison of Geographical Traceability of Wild and Cultivated Macrohyporia Cocos with Different Data Fusion Approaches. J. Anal. Methods Chem. 2021, 2021, 1–13. [Google Scholar] [CrossRef] [PubMed]
  103. Lan, Z.; Zhang, Y.; Sun, Y.; Ji, D.; Wang, S.; Lu, T.; Cao, H.; Meng, J. A Mid-Level Data Fusion Approach for Evaluating the Internal and External Changes Determined by FT-NIR, Electronic Nose and Colorimeter in Curcumae Rhizoma Processing. J. Pharm. Biomed. Anal. 2020, 188, 113387. [Google Scholar] [CrossRef] [PubMed]
  104. Massaro, A.; Negro, A.; Bragolusi, M.; Miano, B.; Tata, A.; Suman, M.; Piro, R. Oregano Authentication by Mid-Level Data Fusion of Chemical Fingerprint Signatures Acquired by Ambient Mass Spectrometry. Food Control. 2021, 126, 108058. [Google Scholar] [CrossRef]
  105. Zhang, L.-N.; Wang, L.; Shi, Z.-Q.; Li, P.; Li, H.-J. A Metabolomic Strategy Based on Integrating Headspace Gas Chromatography-Mass Spectrometry and Liquid Chromatography-Mass Spectrometry to Differentiate the Five Cultivars of Chrysanthemum Flower. RSC Adv. 2018, 8, 9074–9082. [Google Scholar] [CrossRef] [PubMed]
  106. Wong, K.H.; Razmovski-Naumovski, V.; Li, K.M.; Li, G.Q.; Chan, K. Differentiating Puerariae Lobatae Radix and Puerariae Thomsonii Radix Using HPTLC Coupled with Multivariate Classification Analyses. J. Pharm. Biomed. Anal. 2014, 95, 11–19. [Google Scholar] [CrossRef] [PubMed]
  107. Chen, J.; Gong, D.; Liu, X.; Sun, G.; Sun, W. Quality and Antioxidant Activity Evaluation of Dandelion by HPLC with Five-Wavelength Fusion Fingerprint. New J. Chem. 2021, 45, 9856–9863. [Google Scholar] [CrossRef]
  108. Shen, T.; Yu, H.; Wang, Y. Assessing Geographical Origin of Gentiana Rigescens Using Untargeted Chromatographic Fingerprint, Data Fusion and Chemometrics. Molecules 2019, 24, 2562. [Google Scholar] [CrossRef]
  109. Dubin, E.; Spiteri, M.; Dumas, A.-S.; Ginet, J.; Lees, M.; Rutledge, D.N. Common Components and Specific Weights Analysis: A Tool for Metabolomic Data Pre-Processing. Chemom. Intell. Lab. Syst. 2016, 150, 41–50. [Google Scholar] [CrossRef]
  110. Smith, R.; Ventura, D.; Prince, J.T. LC-MS Alignment in Theory and Practice: A Comprehensive Algorithmic Review. Brief. Bioinform. 2015, 16, 104–117. [Google Scholar] [CrossRef]
  111. Charve, J.; Chen, C.; Hegeman, A.D.; Reineccius, G.A. Evaluation of Instrumental Methods for the Untargeted Analysis of Chemical Stimuli of Orange Juice Flavour: Evaluation of Instrumental Methods for Flavoromic Research. Flavour Fragr. J. 2011, 26, 429–440. [Google Scholar] [CrossRef]
  112. Belmonte-Sánchez, J.R.; Romero-González, R.; Martínez Vidal, J.L.; Arrebola, F.J.; Garrido Frenich, A. 1H NMR and Multi-Technique Data Fusion as Metabolomic Tool for the Classification of Golden Rums by Multivariate Statistical Analysis. Food Chem. 2020, 317, 126363. [Google Scholar] [CrossRef] [PubMed]
  113. Di Rosa, A.R.; Leone, F.; Cheli, F.; Chiofalo, V. Fusion of Electronic Nose, Electronic Tongue and Computer Vision for Animal Source Food Authentication and Quality Assessment—A Review. J. Food Eng. 2017, 210, 62–75. [Google Scholar] [CrossRef]
  114. Jiang, Y.; Liu, Y.; Zhan, W.; Zhu, D. Improved Thermal Infrared Image Super-Resolution Reconstruction Method Base on Multimodal Sensor Fusion. Entropy 2023, 25, 914. [Google Scholar] [CrossRef] [PubMed]
  115. Lin, Y.-W.; Lin, Y.-B.; Hung, H.-N. CalibrationTalk: A Farming Sensor Failure Detection and Calibration Technique. IEEE Internet Things J. 2021, 8, 6893–6903. [Google Scholar] [CrossRef]
  116. Guerrero-Ibáñez, J.; Zeadally, S.; Contreras-Castillo, J. Sensor Technologies for Intelligent Transportation Systems. Sensors 2018, 18, 1212. [Google Scholar] [CrossRef] [PubMed]
  117. Lin, K.; Li, Y.; Sun, J.; Zhou, D.; Zhang, Q. Multi-Sensor Fusion for Body Sensor Network in Medical Human–Robot Interaction Scenario. Inf. Fusion 2020, 57, 15–26. [Google Scholar] [CrossRef]
  118. Pan, M.H.; Su, Y.F.; Liu, X.J.; Xiang, S.Q.; Ding, Y.X.; Li, Q. Identification of Forsythia Suspensa (Thunb.) Vahl in Different Harvest Periods Using Intelligent Sensory Technologies, HPLC Characteristic Fingerprint Coupled with Chemometrics. Phytochem. Anal. 2022, 33, 490–501. [Google Scholar] [CrossRef]
  119. Fei, C.; Ren, C.; Wang, Y.; Li, L.; Li, W.; Yin, F.; Lu, T.; Yin, W. Identification of the Raw and Processed Crataegi Fructus Based on the Electronic Nose Coupled with Chemometric Methods. Sci. Rep. 2021, 11, 1849. [Google Scholar] [CrossRef]
  120. Zhang, X.; Wu, H.; Lin, L.; Du, X.; Tang, S.; Liu, H.; Yang, H. The Qualitative and Quantitative Assessment of Xiaochaihu Granules Based on E-Eye, e-Nose, e-Tongue and Chemometrics. J. Pharm. Biomed. Anal. 2021, 205, 114298. [Google Scholar] [CrossRef]
  121. Wang, S.; Lin, Z.; Zhang, B.; Du, J.; Li, W.; Wang, Z. Data Fusion of Electronic Noses and Electronic Tongues Aids in Botanical Origin Identification on Imbalanced Codonopsis Radix Samples. Sci. Rep. 2022, 12, 19120. [Google Scholar] [CrossRef]
  122. Miao, J.; Luo, Z.; Wang, Y.; Li, G. Comparison and Data Fusion of an Electronic Nose and Near-Infrared Reflectance Spectroscopy for the Discrimination of Ginsengs. Anal. Methods 2016, 8, 1265–1273. [Google Scholar] [CrossRef]
  123. Jing, W.; Zhao, X.; Li, M.; Hu, X.; Cheng, X.; Ma, S.; Wei, F. Application of Multiple-Source Data Fusion for the Discrimination of Two Botanical Origins of Magnolia Officinalis Cortex Based on E-Nose Measurements, E-Tongue Measurements, and Chemical Analysis. Molecules 2022, 27, 3892. [Google Scholar] [CrossRef] [PubMed]
  124. Borràs, E.; Ferré, J.; Boqué, R.; Mestres, M.; Aceña, L.; Busto, O. Data Fusion Methodologies for Food and Beverage Authentication and Quality Assessment—A Review. Anal. Chim. Acta 2015, 891, 1–14. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Schemes of the low-, mid-, and high-level data fusion.
Figure 1. Schemes of the low-, mid-, and high-level data fusion.
Sensors 24 00106 g001
Figure 2. The data fusion process for geographical traceability of Paris Polyphylla Var. Yunnanensis. (a) low-level data fusion; (b) mid-level data fusion; (c) high-level data fusion.
Figure 2. The data fusion process for geographical traceability of Paris Polyphylla Var. Yunnanensis. (a) low-level data fusion; (b) mid-level data fusion; (c) high-level data fusion.
Sensors 24 00106 g002
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Huang, R.; Ma, S.; Dai, S.; Zheng, J. Application of Data Fusion in Traditional Chinese Medicine: A Review. Sensors 2024, 24, 106. https://doi.org/10.3390/s24010106

AMA Style

Huang R, Ma S, Dai S, Zheng J. Application of Data Fusion in Traditional Chinese Medicine: A Review. Sensors. 2024; 24(1):106. https://doi.org/10.3390/s24010106

Chicago/Turabian Style

Huang, Rui, Shuangcheng Ma, Shengyun Dai, and Jian Zheng. 2024. "Application of Data Fusion in Traditional Chinese Medicine: A Review" Sensors 24, no. 1: 106. https://doi.org/10.3390/s24010106

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop