**Assessing Geographical Origin of** *Gentiana rigescens* **Using Untargeted Chromatographic Fingerprint, Data Fusion and Chemometrics**

#### **Tao Shen 1,2,3, Hong Yu 1,2,\* and Yuan-Zhong Wang 4**


Academic Editor: Marcello Locatelli

Received: 10 June 2019; Accepted: 12 July 2019; Published: 14 July 2019

**Abstract:** *Gentiana rigescens* Franchet, known for its bitter properties, is a traditional remedy for chronic hepatitis and an important raw material for the pharmaceutical industry in China. In this study, high-performance liquid chromatography (HPLC) coupled with a diode array detector (DAD), together with chemometrics, was used to investigate the geographical variation in the chemical composition of *G. rigescens* and to classify medicinal materials according to the latitude at which they were grown. Chromatographic fingerprints of 840 samples of rhizomes, stems, and leaves from 280 individuals from four different latitude areas were recorded and analyzed to trace the geographical origin of the medicinal materials. First, HPLC fingerprints of the underground and aerial parts were generated using reversed-phase liquid chromatography. After preliminary data exploration, two supervised pattern recognition techniques, random forest (RF) and orthogonal partial least-squares discriminant analysis (OPLS-DA), were applied separately to the three HPLC fingerprint data sets of rhizomes, stems, and leaves. Furthermore, the fingerprint data sets of the aerial and underground parts were processed separately and then joined using two data fusion strategies ("low-level" and "mid-level"). The results showed that classification models based on OPLS-DA were more efficient than RF models. The classification models built with the low-level data fusion method showed good recognition and prediction abilities (accuracy higher than 99%; sensitivity, specificity, Matthews correlation coefficient, and efficiency ranging from 0.95 to 1.00). The low-level data fusion strategy combined with OPLS-DA provided the best discrimination result. In summary, this study explored the latitudinal variation in the phytochemistry of *G. rigescens* and developed a reliable and accurate method for identifying *G. rigescens* grown at different latitudes, based on untargeted HPLC fingerprints, data fusion, and chemometrics. The results are meaningful for the authentication and quality control of Chinese medicinal materials.
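The performance figures quoted in the abstract (accuracy, sensitivity, specificity, Matthews correlation coefficient, and efficiency) can all be derived from the confusion-matrix counts of a single class treated one-vs-rest. The following is a minimal sketch, not the authors' implementation: the function name `binary_metrics` and the example counts are hypothetical, and efficiency is assumed here to be the geometric mean of sensitivity and specificity, a definition commonly used in chemometrics but not stated in the abstract.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """One-vs-rest classification metrics from confusion-matrix counts.

    tp/fn: samples of the target class predicted correctly / incorrectly.
    fp/tn: samples of the other classes predicted as target / as non-target.
    Assumes the target class and the remaining classes are both non-empty.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + fp + tn)

    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # ASSUMPTION: efficiency taken as the geometric mean of sensitivity and specificity.
    efficiency = math.sqrt(sensitivity * specificity)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
        "efficiency": efficiency,
    }


# Hypothetical counts for one latitude class in a test set of 80 samples.
metrics = binary_metrics(tp=19, fn=1, fp=0, tn=60)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a four-class problem such as the four latitude areas, the same function would be applied once per class, each time treating that class as the target and the other three as the remainder.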

**Keywords:** authentication; liquid chromatography fingerprint; chemometrics; random forest; OPLS-DA; data fusion; *Gentiana rigescens*
