Article

Machine Learning-Based Analysis Reveals Triterpene Saponins and Their Aglycones in Cimicifuga racemosa as Critical Mediators of AMPK Activation

1
Medical Department, Max Zeller Söhne AG, 8590 Romanshorn, Switzerland
2
Clinical Pharmacology and Toxicology, Department of General Internal Medicine, Inselspital—University Hospital, 3010 Bern, Switzerland
*
Author to whom correspondence should be addressed.
Pharmaceutics 2024, 16(4), 511; https://doi.org/10.3390/pharmaceutics16040511
Submission received: 28 February 2024 / Revised: 14 March 2024 / Accepted: 5 April 2024 / Published: 7 April 2024
(This article belongs to the Special Issue Recent Advances in Natural Product Drugs, 2nd Edition)

Abstract

Cimicifuga racemosa (CR) extracts contain diverse constituents such as saponins. These saponins, which act as a defense against herbivores and pathogens, also show promise in treating human conditions such as heart failure, pain, hypercholesterolemia, cancer, and inflammation. Some of these effects are mediated by activating AMP-activated protein kinase (AMPK). Therefore, comprehensive screening for activating constituents in a CR extract is highly desirable. In this study, 95 CR constituents were classified using machine learning (ML) techniques, namely Deep Neural Networks (DNN), Logistic Regression Classification (LRC), and Random Forest Classification (RFC), with MACCS molecular fingerprint descriptors. Calibration involved 50 randomly chosen positive and negative controls. LRC achieved the highest overall test accuracy (90.2%), but DNN and RFC surpassed it in precision, sensitivity, specificity, and ROC AUC. All CR constituents were predicted as activators, except for three non-triterpene compounds. The validity of these classifications was supported by good calibration, with misclassifications ranging from 3% to 17% across the various models. High sensitivity (84.5–87.2%) and specificity (84.1–91.4%) suggest suitability for screening. The results demonstrate the potential of triterpene saponins and aglycones in activating AMPK, providing the rationale for further clinical exploration of CR extracts in metabolic pathway-related conditions.

1. Introduction

Extracts of Cimicifuga racemosa L., NUTT. (also known as Actaea racemosa L. or black cohosh) are widely accepted [1,2,3,4] and have been granted “well-established use” status in the treatment of postmenopausal (i.e., climacteric) complaints by the European Medicines Agency [5]. The corresponding monograph predominantly covers vasomotor symptoms such as hot flushes and sweating, as well as nervousness, irritability, and metabolic changes. Although characteristic postmenopausal complaints have been known for a very long time and the beneficial effects of Cimicifuga extracts on climacteric symptoms are well accepted [3,4], the mechanism of action has not yet been fully elucidated.
As well as clinical studies involving female patients, Seidlova-Wuttke et al. (2012) [6] undertook a comprehensive investigation aimed at delving into the beneficial impacts of a CR extract on postmenopausal symptoms in ovariectomized rats. In addition to the commonly reported climacteric effects, the authors were able to discern noteworthy reductions in fat accumulation and a decrease in the manifestations of metabolic syndrome in these animals. As AMP-activated protein kinase (AMPK) plays a pivotal role in regulating cellular metabolism [7], Moser et al. [8] investigated the effect of a CR extract Ze 450 and three of its isolated components (23-epi-26-deoxyactein, protopine, and Cimiracemoside C) on AMPK activity and carbohydrate metabolism in HepaRG cells and male ob/ob mice.
The extract and its components activated AMPK to the same extent as the AMPK activator metformin. The results also showed the extract led to significant reductions in body weight and plasma glucose levels, while improving glucose metabolism and insulin sensitivity in male diabetic ob/ob mice [8]. These findings broadened the mechanism of action of Cimicifuga in various domains to include the activation of AMPK and the subsequent effect on cellular metabolism, as indicated by a recent review discussion [9]. This new perspective brings new areas of application such as metabolic disorders, cardiovascular diseases, obesity, anti-aging, antioxidative, and supportive antiproliferative therapy into the focus of future clinical developments.
When examining the literature on published AMPK activators, the substantial chemical and pharmacological heterogeneity of the activators becomes evident. While only a handful of these (naturally occurring) activators directly target the enzyme itself, such as salicylate or AMP, the majority exert their effects indirectly. They achieve this by either influencing upstream kinases that subsequently phosphorylate AMPK or by reducing cellular ATP levels, leading to AMPK phosphorylation and subsequent activation. In particular, a variety of plant extracts or isolated plant constituents have been described in the literature to activate the enzyme [10,11,12].
Triterpene saponins are a primary class of naturally occurring metabolites that may activate AMPK; polyphenols such as flavonoids, curcumin, stilbenes, and others may also do so [13,14,15]. The class of triterpene saponins is widely distributed throughout the plant kingdom and constitutes a large and diverse group of secondary metabolites. Saponins consist of a hydrophobic (water-repelling) aglycone, which can be steroidal or triterpenoid in nature, and one or more hydrophilic sugar moieties (the glycone part). These sugar moieties can be either monosaccharides or oligosaccharides and exhibit variations in their structure, size, and composition. The most common sugar moieties in steroidal saponins include glucose, galactose, rhamnose, xylose, and arabinose, which can undergo further metabolic processes. The type and number of sugar moieties attached to the steroid or triterpenoid aglycone affect the physicochemical properties and biological activities of the saponins, such as their solubility, stability, and bioavailability [16]. Saponins usually have unfavorable physicochemical properties for oral absorption due to their large molecular mass and hydrophilicity, which hinder enteral absorption and cellular uptake [17]. Hence, biotransformation to aglycones by cleavage of the glycosidic sugars may significantly alter cellular availability and consequently affect their pharmacological effects. Notably, certain saponins undergo deglycosylation by colonic microflora, leading to enhanced intestinal absorption of the lipophilic aglycones. This is observed in the cases of certain ginsenosides and soybean saponins [18,19,20]. These aglycones may also have a higher probability of entering their target cells.
When investigating herbal remedies, experiments can be challenging. The herbal extracts are complex and often contain multiple substances. Additionally, obtaining pure isolated compounds from these extracts can be difficult.
This presents an opportunity where machine learning models can significantly enhance the classification of activator constituents. Machine learning offers the possibility of thorough screening of these complex mixtures so that key compounds can be accurately identified, thereby streamlining subsequent detailed analysis and testing.
Recently, we have published research about sensitive and accurate machine learning models for the classification of AMPK activators [12]. In the present study, an extended and updated version of this applied database of known activators and controls has been used to classify all chemically characterized constituents of the Cimicifuga extract Ze 450 to estimate its ability to activate AMPK.

2. Materials and Methods

The flow and structure of the experiments are illustrated in Figure 1.

2.1. Data

A highly detailed AMPK dataset was compiled in 2021 [13] and recently updated in August 2023. It was compiled by a thorough literature review of AMPK activators and inhibitors, conducted on PubMed (https://pubmed.ncbi.nlm.nih.gov/, accessed on 4 April 2024) using the search terms “AMPK AND activation” and “AMPK AND inhibition”. Compounds were included if they were confirmed activators or inhibitors by at least one publication listed on PubMed. Additionally, the Bioassay database of PubChem Substance and Compound databases (https://pubchem.ncbi.nlm.nih.gov/, accessed on 4 April 2024) was consulted, particularly when compounds exhibited an EC50 of ≤0.1 µM, indicating activation. Conversely, compounds that were tested and found to be inactive for AMPK activation or exhibited inhibitory activity were used as the control group for this analysis. In total, the database comprised N = 1120 and N = 815 active compounds or controls, respectively.
To comprehensively characterize the power of Cimicifuga racemosa, 95 chemically defined compounds from the rhizome were included for analysis [21] (see Table A1, Appendix B).

2.2. Data Preprocessing

Chemical structures were coded using the simplified molecular-input line-entry system (isomeric SMILES taken from PubChem). Data were used to calculate MACCS fingerprint descriptors (Molecular ACCess System, [22]). MACCS fingerprint descriptors are binary representations encoding the presence or absence of specific structural features or substructures within a molecule. They are represented by a fixed-length vector of 166 bits with “0” values indicating absence and “1” values indicating presence. They do not encode information about bond order, stereochemistry, or spatial arrangement of atoms. Despite these limitations, fingerprint descriptors are commonly used in cheminformatics and computational chemistry. Since MACCS fingerprints focus on specific structural features, they are effective at capturing chemical diversity in a dataset [23].
Data preprocessing (curation) entailed eliminating duplicate entries, salts, mixtures, smaller fragments, and proteins from SMILES structures, with a focus on low molecular weight drug-like compounds (molecular weight < 1000). Tautomers were not standardized during this process.
To reduce computational effort and noise, the VarianceThreshold feature selection method was used to remove features with low variance (<0.01%).
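As a rough illustration of variance-based feature filtering, the following pure-Python sketch drops fingerprint bits whose variance does not exceed a threshold. The toy matrix, the function names, and the threshold value are assumptions chosen for illustration; this is not the scikit-learn implementation itself.

```python
from statistics import pvariance

def variance_threshold(rows, threshold):
    """Return the indices of columns whose population variance exceeds threshold."""
    columns = list(zip(*rows))
    return [j for j, col in enumerate(columns) if pvariance(col) > threshold]

def select_columns(rows, keep):
    """Keep only the selected column indices in every row."""
    return [[row[j] for j in keep] for row in rows]

# Toy binary fingerprint matrix (4 molecules x 4 bits); values are illustrative only.
fingerprints = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 1, 1],
    [1, 0, 0, 1],
]

kept = variance_threshold(fingerprints, threshold=0.01)  # hypothetical threshold
reduced = select_columns(fingerprints, kept)
```

In this toy case the first two bits are constant across all molecules and carry no discriminating information, so only the last two columns remain.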
The unbalanced distribution of activators and controls was compensated for by the Synthetic Minority Oversampling Technique (SMOTE, [24]), which generates synthetic samples for the minority class by interpolating between existing samples. It creates new samples that are combinations of neighboring samples, resulting in an even class distribution (1122 members for each class). SMOTE was only applied in the training and not in the test phase.
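The interpolation idea behind SMOTE can be sketched in a few lines. The version below is a simplified illustration under stated assumptions (tiny 2-D toy data, Euclidean distance, hypothetical function name and parameters) and is not the library implementation used for the analysis.

```python
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy minority class; coordinates are illustrative only.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote_sketch(minority, n_new=6)
```

Because each synthetic point lies on a line segment between two existing samples, interpolated values for binary fingerprint bits are generally fractional rather than 0 or 1.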

2.3. Validation

Validation of the models was based on the OECD Principles for (Q)SAR Validation [25], using a 70:30 random split of the 2244 total data points into 1570 training and 674 test data points. These training data were further split (5:1 ratio) into a validation training dataset (N = 1258) and a validation test dataset (N = 314) to optimize model hyperparameters and train the models (using the sklearn train-test split method). After completion of training, the test data served as an external control using 5-fold cross-validation. Furthermore, the training was repeated after randomization of the response variable (Y-randomization [26]).
The high-dimensional data of activators and controls were transformed into a two-dimensional space using the t-distributed stochastic neighbor embedding technique (tSNE). This method offers a visual representation of the structural relationship between various compounds, aiding in the interpretation of the database’s applicability domain [27].

2.4. Machine Learning Models

The following three machine learning techniques were applied: Deep Neural Networks, Logistic Regression Classification, and Random Forest Classification.
All calculations were performed using Python 3.11.2 (https://www.python.org/, accessed on 4 April 2024). Graphical analysis was carried out using OriginPro, version 2023, OriginLab Corporation, Northampton, MA, USA, or Matplotlib, version 3.3.3 (https://matplotlib.org/#, accessed on 5 April 2024).

2.4.1. Deep Neural Network (DNN)

DNNs are sophisticated computational models with multiple interconnected layers, allowing them to automatically learn hierarchical representations of complex patterns from data [28]. Their depth enables effective feature extraction and is a key factor in their success across various machine learning tasks.
The data were assessed using a sequential DNN model, featuring a variable number of dense, hidden, and dropout layers, with HeNormal as the kernel initializer and Constant (value = 0) as the bias initializer. The activation functions employed were the exponential linear unit (ELU) for positive values and sigmoid for the output layers. Binary cross-entropy was utilized as the loss function. Details of the model are given in Appendix A.

2.4.2. Logistic Regression Classification (LRC)

LRC [29] is a powerful and widely used statistical method for modeling the probability of a binary outcome based on one or more independent variables.
LRC is used to estimate the probability $\hat{p}$ that an instance belongs to a class:
$$\hat{p} = h_{\theta}(x) = \sigma(\theta^{T} \cdot x),$$
using the logistic function:
$$\sigma(t) = \frac{1}{1 + e^{-t}}.$$
Binary classification for two classes denoted with 0 and 1 was obtained by
$$\hat{y} = \begin{cases} 0, & \hat{p} < 0.5 \\ 1, & \hat{p} \geq 0.5 \end{cases}$$
The scikit-learn procedure was used (https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html, accessed on 4 April 2024).
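The decision rule of logistic regression described in this subsection can be written out as a short sketch. The weight vector and the two input vectors below are hypothetical values chosen only to show the arithmetic; they are not fitted parameters from the study.

```python
import math

def sigmoid(t):
    """Logistic function mapping any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))

def predict_proba(theta, x):
    """Estimated probability that x belongs to class 1."""
    return sigmoid(sum(w * xi for w, xi in zip(theta, x)))

def predict_class(theta, x, cutoff=0.5):
    """Assign class 1 if the estimated probability reaches the cutoff, else 0."""
    return 1 if predict_proba(theta, x) >= cutoff else 0

theta = [0.5, -1.0, 2.0]  # hypothetical weights; the first entry acts as intercept
x_a = [1, 0, 1]           # leading 1 is the bias term, followed by two feature bits
x_b = [1, 1, 0]

class_a = predict_class(theta, x_a)  # linear score 2.5, probability above 0.5
class_b = predict_class(theta, x_b)  # linear score -0.5, probability below 0.5
```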

2.4.3. Random Forest Classification (RFC)

RFC, an ensemble method (https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html, accessed on 4 April 2024), enhances generalizability and robustness by aggregating multiple base estimators, surpassing the performance of an individual estimator such as a single decision tree. Each decision tree is built independently, considering a random subset of features at each split, and aggregating the trees mainly reduces the variance of the combined estimator. Hyperparameters were optimized through grid search analysis, covering the number of estimators, maximum features utilized, maximum tree depth, minimum samples for split and leaf, and impurity criterion. Notably, no bootstrap sampling was employed in the process.

2.5. Hyperparameter Tuning

The hyperparameter tuning was performed on both the validation training dataset (N = 1258) and a validation test dataset (N = 314), which was derived with a 5:1 split using the train-test split method to optimize model hyperparameters and train the models.
Some of the adjustable hyperparameters of the investigated models were tuned by grid search coupled with a 5-fold cross-validation (using the sklearn GridSearchCV module); the others were kept at their default settings. Specifically for logistic regression, we focused on two key hyperparameters: the inverse of the regularization strength, denoted as “C”, and the penalty function, which could be “l1” (Lasso), “l2” (Ridge), or “elasticnet” (a combination of “l1” and “l2”). These penalty functions help to control the impact of large coefficients in the model, thereby discouraging it from fitting noise in the data. Additionally, we determined the optimal solver among several options: the Newton-conjugate gradient optimization method (“newton-cg”), the limited-memory Broyden–Fletcher–Goldfarb–Shanno optimization method (“lbfgs”), and a coordinate descent approach (“liblinear”).
For DNN, a grid search was performed on learning rate, batch size, number of hidden layers, and dropout layers.

2.6. Model Evaluation

The dataset underwent partitioning using the sklearn.model_selection preprocessing method train_test_split, allocating 30% for testing and 70% for training. Subsequently, a 5-fold cross-validation (CV) was performed.
To compare data distributions and assess the application domain, t-distributed stochastic neighbor embedding analysis was conducted via the sklearn.manifold.TSNE procedure. This technique transforms high-dimensional data into a 2-dimensional representation, facilitating graphical evaluation of applicability domains.
Machine learning model performance was evaluated using the following metrics:
  • Accuracy: (TP + TN)/(TP + TN + FP + FN);
  • Precision: TP/(TP + FP);
  • Sensitivity: TP/(TP + FN);
  • Specificity: TN/(TN + FP).
Here, TP represents true positives (correctly predicted activators), FP denotes false positives (incorrectly predicted activators), TN signifies true negatives (correctly predicted controls), and FN stands for false negatives (incorrectly predicted controls).
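For concreteness, the four metrics can be computed directly from the confusion-matrix counts. The counts used below are made-up numbers for illustration and do not correspond to any result reported in this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, sensitivity, and specificity from counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Hypothetical confusion-matrix counts for 100 test compounds.
metrics = classification_metrics(tp=45, tn=40, fp=10, fn=5)
```

With these counts, 85 of 100 compounds are classified correctly, 45 of 50 true activators are recovered, and 40 of 50 controls are correctly rejected.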

2.7. Prevention of Overfitting

Overfitting is a common problem in machine learning and statistical modeling, and it occurs when a model learns to perform very well on the training data but fails to generalize its predictions to new, unseen data.
One important risk factor is an unbalanced distribution of activators and controls in our database. This is an inherent problem in AMPK activation. Due to the importance of this activation, many potential activator compounds have been tested experimentally, whereas a much smaller number of negative controls (often inhibitors) have been investigated. This leads to a bias in the reported results within the literature. To significantly minimize the risk of overfitting, various methodical precautions were undertaken.

2.7.1. Feature Selection

Since more complex models carry a greater risk of fitting noise and are prone to overfitting, we simplified our models by eliminating those features that contribute information only marginally (e.g., features whose variance falls below a threshold of 0.01).

2.7.2. Cross-Validation

Cross-validation is a machine learning technique that gauges predictive model performance and generalization. Here, a 5-fold variant was used during hyperparameter tuning, followed by a 10-fold variant coupled to the ROC analysis (see below). In the 10-fold variant, the dataset is split into ten roughly equal parts or “folds”. The model is trained on nine of these parts and tested on the remaining one. This process is repeated ten times, with each fold serving as the test set once.
The performance metrics (in our case accuracy) from these ten rounds were then averaged to judge the model’s overall performance. This approach is more robust than a single train-test split because it examines how well the model generalizes across different subsets of the data.
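The fold construction described here can be sketched without any library. The helper names and the `score_fn` callback are hypothetical; in practice `score_fn` would train a model on the training indices and return its accuracy on the test indices.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_validate(n_samples, k, score_fn):
    """Average the score returned by score_fn(train, test) over all k folds."""
    scores = [score_fn(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

# 25 samples split into 10 folds; every sample is used for testing exactly once.
splits = list(k_fold_indices(25, 10))
```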

2.7.3. Regularization

For logistic regression, regularization techniques such as L1 (Lasso), L2 (Ridge), or the elastic net option were used to penalize large coefficients in the model. This discourages the model from fitting noise in the data. The parameter C denotes the inverse of the regularization strength. The choice between these techniques was made during hyperparameter tuning by the grid search procedure. For DNN, dropout layers were evaluated.

2.7.4. Early Stopping

For DNN training, an early stopping procedure (keras.callbacks module EarlyStopping, https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/EarlyStopping, accessed on 4 April 2024) was applied to monitor the training loss and halt training if there was no improvement for five consecutive epochs.
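The stopping logic can be illustrated independently of any deep learning framework. The sketch below is a simplified stand-in for the callback, assuming a patience of five epochs and a made-up loss sequence; the function name and the `min_delta` parameter are illustrative assumptions.

```python
def early_stopping_epoch(losses, patience=5, min_delta=0.0):
    """Return the 0-based epoch at which training would stop, i.e. after
    `patience` consecutive epochs without improvement of the monitored loss."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses):
        if loss < best - min_delta:
            best = loss   # new best value: reset the patience counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(losses) - 1  # patience never exhausted: train to the end

# Hypothetical loss curve: the best value (0.6) is reached at epoch 2.
losses = [0.9, 0.7, 0.6, 0.61, 0.60, 0.62, 0.63, 0.60, 0.5]
stop_epoch = early_stopping_epoch(losses, patience=5)
```

In this example training halts at epoch 7, five epochs after the last improvement, so the later drop to 0.5 is never seen. This shows the trade-off inherent in choosing the patience value.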

2.8. Receiver Operating Characteristic (ROC)

To assess the performance of a binary classifier regardless of thresholds, the receiver operating characteristic (ROC) curve and its corresponding area under the curve (AUC) scores were computed [30]. This evaluation was complemented with a 10-fold cross-validation to ensure the robustness and generalizability of the results.
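The AUC has a simple probabilistic reading: it is the chance that a randomly chosen activator receives a higher score than a randomly chosen control. The following sketch computes it by pairwise comparison; the labels and scores are invented for illustration, and this quadratic-time approach is meant only for small examples.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a correct ranking."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = activator, 0 = control) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(y_true, scores)
```

Here eight of the nine activator-control pairs are ordered correctly. A value of 1.0 would indicate perfect separation and 0.5 would correspond to random ranking.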

2.9. y-Randomization

A final aspect of method validation is y-randomization. In this step, the DNN was applied to the molecular descriptors (denoted by X) unchanged, while the target y was randomized (null model). The performance was then measured. If the original model significantly outperformed the null model, it suggested a meaningful relationship between the molecular descriptors (X) and biological activity (denoted by y) in our dataset. In such a scenario, it provided confidence in the predictive power of our model. To enhance confidence further, this process was repeated 50 times.
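The procedure can be sketched with a deliberately simple stand-in model. The example below uses a nearest-centroid classifier and synthetic one-dimensional data, both of which are assumptions made for illustration and unrelated to the DNN or the descriptors of the study; only the shuffling logic mirrors the described approach.

```python
import random
from statistics import mean

def fit_centroids(X, y):
    """Nearest-centroid stand-in model: one mean feature vector per class."""
    centroids = {}
    for label in sorted(set(y)):
        members = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids

def predict(centroids, x):
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[label]))
    return min(centroids, key=sq_dist)

def holdout_accuracy(X, y):
    """Train on even-indexed samples, test on odd-indexed samples."""
    model = fit_centroids(X[0::2], y[0::2])
    hits = sum(predict(model, x) == t for x, t in zip(X[1::2], y[1::2]))
    return hits / len(y[1::2])

def y_randomization(X, y, n_rounds=50, seed=0):
    """Repeat training with shuffled labels while leaving X unchanged."""
    rng = random.Random(seed)
    null_scores = []
    for _ in range(n_rounds):
        shuffled = y[:]
        rng.shuffle(shuffled)
        null_scores.append(holdout_accuracy(X, shuffled))
    return null_scores

# Synthetic, well-separated data: class 0 spans 0..19, class 1 spans 30..49.
X = [[float(i)] for i in range(20)] + [[float(i + 10)] for i in range(20, 40)]
y = [0] * 20 + [1] * 20

observed = holdout_accuracy(X, y)
null_scores = y_randomization(X, y)
```

Comparing `observed` with the distribution of `null_scores` indicates whether the original model captures a real relationship between X and y or merely a chance pattern.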

2.10. Classification of Cimicifuga racemosa (CR) Constituents

Using the SMILES of the CR constituents, the same molecular descriptors were calculated as for the database. While the standardizer was fitted to the database and then used to transform it, the CR descriptors were only transformed using this already fitted standardizer. Using the best-performing model of the training, the CR constituents were predicted as either AMPK activators or controls.
To calibrate these classifications, 50 randomly chosen samples of the positive and negative controls of the database were each also classified in the same run. The models were ranked by the number of misclassifications.
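The distinction between fitting a scaler on the reference database and merely applying it to new compounds can be shown with a minimal min-max scaling sketch. The function names and numbers are hypothetical, and min-max scaling is used here only as one plausible example of such a transformation.

```python
def fit_min_max(rows):
    """Learn per-column (min, max) from the reference data only."""
    return [(min(col), max(col)) for col in zip(*rows)]

def transform_min_max(rows, params):
    """Scale rows with previously learned parameters; constant columns map to 0."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, (lo, hi) in zip(row, params)]
        for row in rows
    ]

# Hypothetical descriptor values for a reference database and one new compound.
database = [[0, 10], [5, 20], [10, 30]]
new_compounds = [[5, 40]]

params = fit_min_max(database)                # fitted on the database only
database_scaled = transform_min_max(database, params)
new_scaled = transform_min_max(new_compounds, params)  # transform only, no refit
```

Because the parameters come from the database alone, a new compound can fall outside the [0, 1] range (1.5 for the second descriptor here), which is itself a hint that it lies at the edge of the applicability domain.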

2.10.1. Comparison of Cimicifuga racemosa (CR) Metabolites with Database

The best-performing model from the analysis was then employed to classify the transformed CR constituent descriptors. For each CR constituent, the five most similar members of the database were determined through pairwise calculation of cosine similarity scores (k) using scikit-learn (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.cosine_similarity.html, accessed on 4 April 2024):
$$k(x, y) = \frac{\langle x, y \rangle}{\lVert x \rVert \cdot \lVert y \rVert},$$
where $\lVert \cdot \rVert$ denotes the Euclidean norm and $\langle x, y \rangle$ denotes the dot product of vectors x and y. It ranges from −1 to 1. Values of k > 0.8 were regarded as similar.
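A small sketch of the similarity lookup follows. The compound names and bit vectors are placeholders invented for illustration; the ranking helper is a hypothetical convenience function, not part of scikit-learn.

```python
import math

def cosine_similarity(x, y):
    """Dot product of x and y divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)  # undefined for all-zero vectors

def most_similar(query, database, top_n=5):
    """Return the top_n (name, similarity) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]

# Placeholder fingerprints; names and bits are illustrative only.
database = {
    "cpd_A": [1, 1, 0, 1],
    "cpd_B": [1, 0, 0, 1],
    "cpd_C": [0, 0, 1, 0],
}
query = [1, 1, 0, 1]
ranking = most_similar(query, database, top_n=2)
```

With these toy vectors, cpd_A is identical to the query (similarity 1) and cpd_B scores about 0.82, so both would pass the 0.8 cut-off, whereas cpd_C shares no set bits with the query.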

2.10.2. Comparison of Cimicifuga racemosa (CR) Saponins with Their Estimated Aglycones

In total, 46 of the CR constituents were identified as saponins. Their original SMILES codes were theoretically deglycosylated, following the approach suggested by SwissADME [31], to generate new SMILES codes for their corresponding aglycones. These new SMILES codes were then used to generate descriptors from the estimated aglycones for classification.

2.10.3. Assessment of Markers for Oral Absorption

A comparison between triterpene saponin constituents and their aglycones was conducted using the web tool SwissADME [31] available at http://www.swissadme.ch, accessed on 4 April 2024. This tool utilizes robust and predictive models for physicochemical properties, pharmacokinetics, and drug-likeness. It allowed us to estimate several parameters considered as indicators for the oral bioavailability of drugs, including molecular weight (MW), water solubility [32], topological polar surface area (TPSA; [33]), distribution coefficient XlogP [34], the number of violations of Lipinski’s rule of five [35], and the estimated lead-likeness [31].
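To make the rule-of-five criterion concrete, the sketch below counts violations using the classic thresholds (molecular weight above 500, logP above 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors). The two property sets are invented for illustration, and the exact thresholds and logP variant applied by a given web tool may differ from this classic formulation.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count violations of the classic rule-of-five thresholds."""
    checks = [
        mol_weight > 500,   # molecular weight in g/mol
        logp > 5,           # octanol-water partition coefficient
        h_donors > 5,       # hydrogen-bond donors
        h_acceptors > 10,   # hydrogen-bond acceptors
    ]
    return sum(checks)

# Hypothetical property values for a glycoside and its sugar-free aglycone.
glycoside = {"mol_weight": 660.0, "logp": 2.5, "h_donors": 4, "h_acceptors": 11}
aglycone = {"mol_weight": 470.0, "logp": 4.8, "h_donors": 2, "h_acceptors": 5}

glycoside_violations = lipinski_violations(**glycoside)
aglycone_violations = lipinski_violations(**aglycone)
```

In this invented example, removing the sugar lowers both the molecular weight and the acceptor count below the thresholds, which is the kind of shift one would look for when comparing saponins with their aglycones.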

3. Results

3.1. t-SNE Analysis

The t-SNE graphical analysis indicates a clear separation between the two classes, namely activators and controls, across the MACCS fingerprint descriptors (Figure 2).
For illustration, the distribution of four important parameters between activators and controls is displayed in Figure 3.

3.2. Feature Reduction

Variance threshold reduction simplified the models by reducing the number of MACCS fingerprint features from the initial 166 to 139.

3.3. Hyperparameter Tuning

For the MACCS fingerprint descriptors, a batch size of 16, no dropout layers, a learning rate of 0.001, and three hidden layers were found to be optimal for the DNN model. The Adam optimizer was used, with binary cross-entropy as the loss function.
For the LRC model, an inverse regularization strength C of 0.5, an L2 penalty, and the liblinear solver were selected, and “newton-cg” was likewise estimated to be an optimal solver. For RFC, the gini criterion was chosen, the maximum features were set to log2 (number of features), the min_samples_leaf and min_samples_split were set to 1 and 4, respectively, and the number of estimators was set to 110.
All other parameters were left at their default settings.

3.4. Test Performances

In evaluating the performance of various machine learning techniques, all models demonstrated a commendable accuracy level of approximately 90%. Notably, the DNN model exhibited superior performance compared with other models by minimizing the number of misclassifications on the calibration data. With DNN, there were only three misclassifications, in contrast to 17 for LRC and 9 for the RFC model.
While the LRC model achieved the highest overall test accuracy at 90.2%, both the DNN and RFC models surpassed it in terms of precision, sensitivity, specificity, and ROC AUC, as summarized in Table 1.
All models utilized the MinMax Scaler for data scaling prior to modeling. As a side note, the RFC model was also evaluated without prior scaling, producing identical results to those obtained with scaled data.
The area under the receiver operating characteristic curve (ROC AUC) assesses a model’s capacity to differentiate between activator and control classes across various thresholds. These curves (Figure 4) were combined using a 10-fold cross-validation. A higher ROC AUC value indicates better class discrimination, with the optimal value being 1.0, whereas a value of 0.5 corresponds to random guessing.

3.5. y-Randomization

Notably, in none of the 50 shuffled models could a distinction be made between activators and controls (see Table 1). The mean accuracy ranged from 57.6% ± 1.8% to 57.8% ± 1.8%. These results suggest that the unchanged models are statistically significant and are unlikely to have arisen by chance. This provides confidence in the predictive power of our models.

3.6. Classification of Cimicifuga racemosa (CR) Constituents

For classification, 103 chemically defined CR root compounds were identified [21] and checked for isomeric SMILES codes by using the PubChem database (https://pubchem.ncbi.nlm.nih.gov/, accessed on 4 April 2024). In total, 95 distinct compounds with all information available were used for analysis (see Table A1, Appendix B).
All compounds with triterpene and triterpenoid structures were classified as active. This classification is supported by the literature for 23-epi-26-deoxyactein and cimiracemoside C [8]. From the non-triterpene compounds, the cinnamic, benzoic, or fukiic acid derivatives were clearly classified as active. A literature search supported this classification for sinapic acid [36], p-coumaric acid [11], isoferulic acid [37], protocatechuic acid [37], and protocatechuic aldehyde [38]. Compounds such as cimiracemates, cimiphenones, cimifugic acid derivatives, and actealactone were likewise classified as active. Among the chromones (angelicain, cimifugin, and visnagin), only angelicain and cimifugin were classified as active, whereas visnagin was classified as inactive, possibly due to the absence of a propan-2-ol group. Interestingly, the glycoside cimidahurin was classified as active. However, its aglycone hydroxytyrosol, and not the compound itself, was identified in the literature as an activator of AMPK [39]. For the chemical structures, see Appendix B: Table A1.
Further support for these classifications came from a similarity comparison of the CR constituents against our database. The constituents demonstrated high similarity to database compounds, with median similarity scores descending from 0.94 to 0.91. However, five compounds, including cimipronidine (0.78), cimipronidine methyl ester (0.74), dopargine (0.77), and N-methylcytisine (0.797), recorded the lowest similarity scores, aligning with their lower probability estimates of AMPK activation, as indicated in Figure 5. These findings, including individual similarity scores, are detailed in Table A2 in Appendix B.

3.7. Comparison of Saponins with Their Aglycones

The 46 theoretical aglycones showed no systematic and significant differences in their probability compared with the saponins from which they were derived [31].
Saponins and their corresponding aglycones were analyzed for several markers indicative of oral bioavailability and drug-likeness (Figure 6). Data were submitted to the open-source web tool SwissADME [31], available at http://www.swissadme.ch, accessed on 4 April 2024.
As constructed, the molecular weight of aglycones was consistently lower than their corresponding saponins. While water solubility exhibited a significant decrease, on average (p = 0.02, paired two-sided t-test), compared with the solubility of saponins, there was a notable overlap between the two groups. In contrast, the topological polar surface area showed minimal overlap and a highly significant difference (p < 0.0001, paired two-sided t-test) between aglycones and saponins. An increase in lipophilicity, as indicated by the significant elevation of XLogP (p < 0.0001, paired two-sided t-test), was evident.
Assessing oral bioavailability using Lipinski’s rule of five [35], which indicates more favorable oral bioavailability when its criteria are met, revealed significantly fewer violations for the aglycones (p = 0.01, Wilcoxon signed-rank test). Despite expectations that the observed effects on topological polar surface area (TPSA) and XLogP would manifest as clear differences in water solubility, the substantial overlap in solubility suggests that various physicochemical parameters exert opposing effects. This phenomenon cannot be explained by lipophilicity alone. Concerning druggability (lead-likeness), no clear advantage of the aglycones over the saponins could be demonstrated (p = 0.09, Wilcoxon signed-rank test).

4. Discussion

Herbal preparations encompass complex mixtures of potentially active chemical compounds. Nevertheless, comprehensive in vitro experiments often necessitate pure, isolated substances for each identified constituent. Regrettably, such isolated constituents are frequently insufficiently available. Hence, our extended approach uses machine learning tools, offering novel opportunities to screen these multi-substance preparations and identify promising lead compounds. These can then undergo rigorous subsequent testing.
Even when availability problems are set to one side, directly assessing each ingredient in vitro is a resource-intensive and time-consuming endeavor. A swifter, more cost-effective solution could be employing diverse machine learning models. These models, based on an established structure–activity database, can predict the AMPK activation potential of numerous so far uncharacterized substances “in a single run”.
All models investigated showed very good performance in discriminating AMPK activators from controls. Surprisingly, with the exception of three compounds (cyclocimipronidine, dopargine, and N-methylcytisine), all of the 95 investigated CR constituents were clearly predicted activators. It was therefore necessary to rule out a technical artifact caused by the overfitting of the model. Overfitting is a common problem in machine learning and statistical modeling, and it occurs when a model learns to perform very well on the training data but fails to generalize its predictions to new, yet unseen data. In other words, an overfitted model has focused on capturing the noise or random fluctuations in the training data instead of accurately capturing the underlying patterns or relationships.
A risk factor for overfitting is an unbalanced distribution of activators and controls in our database. This is an inherent problem in pharmacology. Due to the importance of AMPK activation, many potential activator compounds have been experimentally tested, whereas a much smaller number of negative controls (often inhibitors) have been investigated. This leads to a bias in the reported results within the literature.
To mitigate the risk of overfitting, several methodological measures were implemented:
  • Balancing unevenly distributed dataset classes;
  • Employing simpler models;
  • Implementing cross-validation;
  • Utilizing regularization techniques;
  • Employing early stopping techniques.
All of these precautions were applied rigorously in the present study.
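Two of the safeguards listed above, class balancing and cross-validation, can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the text does not specify which balancing technique was used, so random undersampling is an assumption here, and the helper names `undersample` and `stratified_folds` as well as the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def undersample(labels, seed=0):
    """Return sorted indices of a class-balanced subset (random undersampling)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    keep = []
    for idxs in by_class.values():
        keep.extend(rng.sample(idxs, n_min))
    return sorted(keep)


def stratified_folds(indices, labels, k=5, seed=0):
    """Split indices into k folds that each preserve the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    return folds


# Toy labels: 40 activators (1) and 10 controls (0); purely illustrative.
labels = [1] * 40 + [0] * 10
balanced = undersample(labels)
folds = stratified_folds(balanced, labels, k=5)

for i, test_idx in enumerate(folds):
    # Every fold serves once as the held-out set; the rest is used for training.
    train_idx = [j for fold in folds if fold is not test_idx for j in fold]
    n_pos = sum(labels[j] for j in test_idx)
    print(f"fold {i}: train={len(train_idx)} test={len(test_idx)} activators_in_test={n_pos}")
```

With these toy labels, the balanced subset contains 20 compounds, and each of the five folds holds four compounds, two per class, so that performance is always estimated on data the model has not seen.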
As we have previously demonstrated [12], the positive controls within our dataset, which serve as activators, exhibit a notable structural diversity. This diversity arises from the fact that a significant proportion of activators exert their effects indirectly. They interact with regulatory sites upstream in the biological pathways. When these sites are activated, they, in turn, trigger the phosphorylation and activation of AMP-activated protein kinase (AMPK). AMPK is a critical enzyme responsible for sensing and regulating energy supply, as well as various cellular functions. These functions include controlling carbohydrate entry and metabolism, generating reactive oxygen species (ROS), regulating apoptosis, modulating cellular growth, and influencing processes like mitochondrial biogenesis and autophagy.
While we achieved an excellent predictive performance on our unseen test dataset, it is important to acknowledge that the presence of unaccounted-for mechanisms cannot be ruled out. It is also worth noting that machine learning models have inherent limitations. They provide classification probabilities that ideally should be validated through direct in vitro or in vivo experiments or by other evidence. Another limitation lies in the research process itself, which focuses on AMPK activators rather than inhibitors or inactive substances. As a result, significantly fewer substances have been identified that inhibit AMPK or, perhaps even more importantly, are confirmed not to interact with it. This leads to a selection bias and an unbalanced class distribution in our database and thus poses a theoretical risk of over-identifying active substances. External evidence should therefore also be sought.
The validity of the classifications is clearly supported by the calibration, which for each model used 50 randomly selected positive and negative controls. Their classifications were clearly separated, with only 3% to 17% misclassifications across the three models under investigation. In addition, the high sensitivity (84.5–87.2%) and specificity (84.1–91.4%) strongly indicate suitability as a screening tool.
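For readers less familiar with these screening metrics, the following sketch shows how sensitivity, specificity, and the misclassification rate follow from true and predicted labels. The function name `screening_metrics` and the toy labels are hypothetical and do not reproduce the study data.

```python
def screening_metrics(y_true, y_pred):
    """Return sensitivity, specificity and misclassification rate (1 = activator)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # activators found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls called active
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # activators missed
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "misclassified": (fp + fn) / len(pairs),
    }


# Toy example: 20 activators followed by 20 controls (illustrative only).
y_true = [1] * 20 + [0] * 20
y_pred = [1] * 17 + [0] * 3 + [0] * 18 + [1] * 2

metrics = screening_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

In this toy case, 17 of 20 activators and 18 of 20 controls are classified correctly, giving a sensitivity of 85.0%, a specificity of 90.0%, and a misclassification rate of 12.5%.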
To further substantiate the classification of the 95 CR constituents as either activators or controls, a comprehensive similarity analysis against all compounds in our database was performed. This involved computing the structural similarity of each CR constituent to every database entry and identifying the five most closely matching compounds for each of them (Table A2 in Appendix B). Notably, the CR constituents displayed considerable structural similarity to the positive control compounds within our database, with median similarity scores ranging from 0.94 down to 0.91. A subset of compounds, specifically cimipronidine (0.78), cimipronidine methyl ester (0.74), dopargine (0.77), and N-methylcytisine (0.797), registered the lowest similarity scores. This correlates with their diminished likelihood of activating AMPK, as reflected in the probability estimates presented in Figure 4. The individual similarity scores documented in Table A2 thus provide a robust data foundation supporting our classification results.
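The similarity search described above can be sketched as a cosine similarity between binary fingerprint vectors, followed by a ranking of the database entries. That the scores in Table A2 were computed on MACCS bit vectors in exactly this way is an assumption; the function names, the 8-bit toy fingerprints, and the reference names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero fingerprint has no defined direction
    return dot / (norm_a * norm_b)


def top_k_similar(query, database, k=5):
    """Return the k database entries most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, fp)) for name, fp in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy 8-bit fingerprints; real MACCS keys are much longer.
database = {
    "ref_A": [1, 1, 0, 1, 0, 0, 1, 0],
    "ref_B": [1, 1, 0, 1, 0, 0, 0, 0],
    "ref_C": [0, 0, 1, 0, 1, 1, 0, 1],
}
query = [1, 1, 0, 1, 0, 0, 1, 0]

top = top_k_similar(query, database, k=2)
for name, score in top:
    print(f"{name}: {score:.3f}")
```

Here the query is identical to `ref_A` (score 1.000) and shares three of its four set bits with `ref_B` (score 0.866), while `ref_C` shares none and is ranked last.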
Studying herbal drugs presents a unique set of challenges due to the complexity of herbal extracts, which consist of multiple substances. Additionally, obtaining pure substances from these extracts is often a challenging task, resulting in limited availability. Consequently, our improved method offers exciting new prospects for conducting thorough analyses of these complex mixtures. It enables the examination of multi-component herbal extracts to identify particular compounds of interest. Subsequently, these compounds can undergo more extensive assessments and evaluations, followed by further refinement of the extracts to enhance the concentration of the desired components.
Our results indicate that the models clearly classified all constituents of Cimicifuga racemosa as activators apart from three non-triterpenes. This suggests a high probability of their ability to activate AMPK. However, we cannot determine the strength of this activation from our findings. Moreover, it is plausible that this activation is a collaborative or even synergistic effect, considering that many constituents were classified as active. The overall effect is certainly influenced by the concentrations of these active compounds at the site of action, which is hard to predict.
It is perplexing that the models made no distinction between triterpene saponins and their aglycones in terms of the probability of classifying the compounds as activators. Although it is conceivable that aglycones, due to their higher lipophilicity, have a greater likelihood of being absorbed into tissues and reaching the site of action [40], our model merely predicts whether the compounds are capable of activating AMPK at all. It does not take into account the dose–response relationship and kinetics.
Triterpene saponins, known for their high hydrophilicity, exhibit limited oral absorption from the gastrointestinal tract, especially when compared to their respective lipophilic aglycones (for a review, see [40]). In our experiments, the range of water solubility values of CR triterpene saponins significantly overlapped the range of the values of their corresponding aglycones, suggesting that this statement likely needs to be assessed individually for each saponin and aglycone. Consequently, it is difficult to predict the overall oral absorption of a multicomponent mixture such as an herbal extract.
In current Cimicifuga racemosa extracts, the aglycone content is relatively low. Nevertheless, research has demonstrated that a significant portion of the dose of a triterpene saponin, as observed with 23-epi-26-deoxyactein, is orally absorbed in both rats [41] and humans [42]. In addition, following oral administration, certain triterpene saponins have the potential to reach the large intestine, where they might undergo degradation by the colonic microbiome. This process, similar to what has been observed for other triterpene saponins [40], could also contribute to the overall effect.
This study has some limitations. While MACCS (Molecular Access System) descriptors are widely utilized in cheminformatics and machine learning for representing chemical compounds [23], it is essential to acknowledge their inherent limitations and potential biases. Because they are rooted in predefined substructures, they may be biased towards specific compound types or functional groups and may overlook less common or innovative structural motifs. The reliance on a fixed set of molecular features may impede the generalizability of machine learning models across diverse chemical datasets. Furthermore, some MACCS descriptors may exhibit high correlation or redundancy, leading to multicollinearity in the feature space. Addressing such issues is crucial because multicollinearity can impair the stability and interpretability of machine learning models; it often necessitates feature selection or dimensionality reduction techniques, as applied in our study.
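One simple way to reduce redundancy among fingerprint bits is to drop constant columns and columns that are highly correlated with a column already kept. The sketch below illustrates this idea only; the specific feature selection procedure used in this study is not described in this section, and the function names, the threshold of 0.95, and the toy columns are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, threshold=0.95):
    """Keep non-constant columns that are not highly correlated with a kept one."""
    kept = []
    for name, values in columns.items():
        if len(set(values)) < 2:
            continue  # constant bit: carries no information
        if any(abs(pearson(values, columns[k])) >= threshold for k in kept):
            continue  # redundant with a feature that is already kept
        kept.append(name)
    return kept


# Toy descriptor columns (one value per compound); illustrative only.
columns = {
    "bit_a": [1, 0, 1, 0, 1, 0],
    "bit_b": [1, 0, 1, 0, 1, 0],  # exact duplicate of bit_a
    "bit_c": [0, 0, 0, 0, 0, 0],  # constant
    "bit_d": [1, 1, 0, 0, 1, 0],
}

print(select_features(columns))  # ['bit_a', 'bit_d']
```

The greedy order matters: a column is compared only with columns already kept, so the first member of each correlated group is the one retained.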
Moreover, MACCS descriptors are primarily tailored for small organic molecules and may not adequately represent complex biomolecules or materials. Hence, to ensure compatibility with the descriptor’s scope, we constrained our dataset to small compounds (molecular weight ≤ 1000).
A PubMed search using the terms “AMPK” and “QSAR” reveals that various QSAR models for predicting AMPK activation have been documented [43,44]. These models predominantly rely on pharmacophore docking, homology modeling, and structure-, ligand-, or fragment-based design strategies, focusing solely on compounds that activate AMPK directly. Diverging from these methodologies, our research appears to be the first to comprehensively incorporate compounds that activate AMPK, regardless of whether the activation is direct or indirect. This inclusive approach enables a broader understanding and captures the diverse mechanisms of AMPK activation more effectively, addressing the enzyme’s activation heterogeneity.

5. Conclusions

The results of this study indicate that all tested triterpene saponins, as well as their aglycones, may contribute to activating AMP-dependent protein kinase (AMPK). With regard to the mechanism, this may suggest a collaborative or even synergistic action on the enzyme. Since AMPK plays a pivotal role in various interconnected metabolic pathways, our results further underscore the rationale for clinically investigating the therapeutic benefits of Cimicifuga racemosa extracts in conditions associated with disturbances in these metabolic pathways.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/pharmaceutics16040511/s1; Table S1: DB_descript_MACCS; Table S2: Cimi_descript_MACCS; Table S3: Experiments.

Author Contributions

Conceptualization, J.D., V.S. and G.B.; methodology, V.S.; software and validation, J.D., V.S. and O.D.; formal analysis, J.D.; data curation, J.D.; writing—original draft preparation, J.D.; writing—review and editing, V.S., G.B., O.D. and A.S.; supervision, J.D. and G.B.; funding acquisition, G.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable, since the studies did not involve humans or animals.

Informed Consent Statement

Not applicable, since the studies did not involve humans or animals.

Data Availability Statement

A complete list of the activators and controls used is given in the Supplementary Materials as Tables S1–S3. The source code of the Deep Neural Networks model is given in Appendix A, and supporting data are given in Table A1 and Table A2.

Conflicts of Interest

J.D., O.D., A.S. and G.B. work at Max Zeller Söhne AG, a phytopharmaceutical company. V.S. declares no conflicts of interest. The design of this study was the sole responsibility of the authors. The funders/company had no role in the design of the study; in the collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A. Details of the Deep Neural Networks Model

  • Details of the Deep Neural Networks model;
  • Python code: Model.ipynb;
  • Database: database.csv.
# Utilities for cross-validation and hyperparameter search
from sklearn.model_selection import KFold, GridSearchCV, cross_val_score
from sklearn.metrics import make_scorer, accuracy_score
# Keras building blocks for the network
from keras.models import Model, Sequential, load_model
from keras.layers import Input, Dense, Activation, Dropout
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint

# X_train (the descriptor matrix) is assumed to have been loaded from database.csv beforehand
n_features = X_train.shape[1]
n_targets = 1
learning_rate = 0.01
n_hidden = 4
batch_size = 32
epochs = 10
# init_w and init_b are not defined in the published listing;
# the Keras default initializers are assumed here as placeholders
init_w = "glorot_uniform"
init_b = "zeros"

def create_model(n_features: int, learning_rate: float, n_hidden: int, batch_size: int,
                 dropout: float) -> Model:
    # Input layer: one value per molecular descriptor
    inputs = Input(shape=(n_features,))
    x = Dense(1_500, kernel_initializer=init_w, bias_initializer=init_b)(inputs)
    x = Activation("elu")(x)
    x = Dropout(dropout)(x)
    # Hidden layers shrink by 300 units per step (1500, 1200, 900, 600 for n_hidden = 4)
    for i in range(n_hidden):
        x = Dense(1_500 - i * 300, kernel_initializer=init_w, bias_initializer=init_b)(x)
        x = Activation("elu")(x)
        x = Dropout(dropout)(x)
    # Sigmoid output: probability of being an AMPK activator
    outputs = Dense(n_targets, activation="sigmoid")(x)
    model = Model(inputs=inputs, outputs=outputs)
    model.compile(loss="binary_crossentropy", optimizer=Adam(learning_rate=learning_rate), metrics=["accuracy"])
    return model
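
To make the resulting architecture easier to inspect without installing Keras, the following sketch reproduces only the layer-width arithmetic of the listing above (an initial layer of 1500 units, then `1_500 - i*300` units per hidden layer, then the output layer) and counts the trainable dense parameters. The helper names are hypothetical, and the input size of 166 descriptors is an assumption used purely for illustration, since the number of features remaining after feature selection is not stated here.

```python
def layer_widths(n_hidden: int, first: int = 1500, step: int = 300, n_targets: int = 1):
    """Widths of all Dense layers: initial layer, hidden layers, output layer."""
    widths = [first]
    for i in range(n_hidden):
        width = first - i * step
        if width <= 0:
            raise ValueError(f"hidden layer {i} would have {width} units")
        widths.append(width)
    widths.append(n_targets)
    return widths


def dense_parameter_count(n_features: int, widths) -> int:
    """Total weights and biases of a stack of fully connected layers."""
    total = 0
    n_in = n_features
    for n_out in widths:
        total += (n_in + 1) * n_out  # weights plus one bias per output unit
        n_in = n_out
    return total


widths = layer_widths(n_hidden=4)
print(widths)  # [1500, 1500, 1200, 900, 600, 1]
print(dense_parameter_count(166, widths))  # input size of 166 is an assumption
```

The width formula also bounds the depth: with a step of 300 units, a sixth hidden layer would have zero units, so the helper raises an error for `n_hidden` of 6 or more.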

Appendix B

Table A1. Major constituents of Cimicifuga racemosa extracts.

Shengmanol type (16-ketone type)
Compounds | R1 | R2 | R3 | R4 | R5 | R6 | Δ7,8 | CID
23-O-Acetylshengmanol | OH | H | H | OH | O-Ac | Epoxy | - | 91827092
23-O-Acetylshengmanol-3-O-β-D-xylopyranoside | O-Xyl | H | H | OH | O-Ac | Epoxy | - | 56962372
23-O-Acetylshengmanol-3-O-α-L-arabinopyranoside | O-Ara | H | H | OH | O-Ac | Epoxy | - | 10865257
Bugbanoside C | O-Ara | H | O-Ac | OH | =O | OH/OH | + | 15894670
Bugbanoside D | O-Ara | H | O-Ac | OH | =O | Epoxy | + | 15894671
Bugbanoside E | O-Ara | H | O-Ac | H | =O | Epoxy | + | 15894672
Cimicifugoside H-1 | O-Xyl | OH | H | H | =O | Epoxy | + | 15241163
Cimicifugoside H-3 | O-Xyl | OH | H | H | =O | CH2OH | + | 15241164
Cimiracemoside L | 4′-O-Ac-Xyl | H | H | OH | O-Ac | Epoxy | - | 10952624
Cimicidanol | OH | OH | H | H | =O | Epoxy | + | 10413064

Hydroxyshengmanol type
Compounds | R1 | R2 | R3 | R4 | R5 | 16 | 23 | 24 | CID
24-Acetylhydroshengmanol-3-O-β-D-xylopyranoside | O-Xyl | OH | OH | CH3 | O-Ac/OH | S | S | S | 157168
Cimiracemoside E | O-Xyl | =O | H | CH2OH | O-Ac/OH | R | R | S | 91827210
Shengmanol | OH | OH | OH | CH3 | Epoxy | S | R | S | 101133349
Shengmanol-3-O-β-D-xylopyranoside | O-Xyl | OH | OH | CH3 | Epoxy | S | R | S | 158275

Cimigenol type (A)
Compounds | R1 | R2 | R3 | R4 | Δ7,8 | 15 | 24 | CID
Cimigenol | OH | H | CH3 | OH | - | R | S | 16020000
Cimigol | OH | H | CH3 | OH | - | S | R | 101596828
25-O-Acetylcimigenol | OH | H | CH3 | O-Ac | - | R | S | 46881255
25-O-Acetylcimigenol 3-O-α-L-arabinopyranoside | O-Xyl | H | CH3 | O-Ac | - | R | S | 24721386
25-O-Methylcimigenol | OH | H | CH3 | O-CH3 | - | R | S | 146027510
25-O-Methylcimigenol-3-O-β-D-xylopyranoside | O-Xyl | H | CH3 | O-CH3 | - | R | S | 146027510
25-O-Ethylcimigenol-3-O-β-D-xylopyranoside | O-Xyl | H | CH3 | O-CH2CH3 | - | R | S | 16091662
12-β-Acetoxycimigenol | OH | O-Ac | CH3 | OH | - | R | S | 16104912
12-β-Acetylcimigenol-3-O-β-D-xylopyranoside | O-Xyl | O-Ac | CH3 | OH | - | R | S | 44418831
12-β-Hydroxycimigenol | OH | OH | CH3 | OH | - | R | S | 10006332
Bugbanoside F | O-Ara | OH | CH3 | OH | + | R | S | 101096469
Cimiracemoside B | O-Xyl | H | CH2OH | OH | - | R | S | 91826883
Cimiracemoside C (=Cimifugoside M) | O-Ara | H | CH3 | OH | - | R | S | 15541911
Cimiracemoside D | O-Ara | O-Ac | CH3 | OH | - | R | S | 70698290
Cimiside A | O-Xyl | OH | CH3 | OH | - | R | S | 91827183
Cimiside B | 3′-O-Xyl-3-O-Xyl | H | CH3 | OH | - | R | S | 10054869

Cimigenol type (B)
Compounds | R1 | R2 | 24 | CID
25-AnhydroCimigenol-3-O-α-L-arabinopyranoside | O-Ara | H | R | 70698285
Cimiracemoside J | O-Ara | O-Ac | S | 10952455
Cimiracemoside K | O-Xyl | O-Ac | S | 10930352
Cimiside E | O-Xyl | H | S | 102147078

Acteol type
Compounds | R1 | R2 | R3 | Δ7,8 | 24 | 25 | CID
Acteol | OH | OH | OH | - | S | R | 59595161
Actein | O-Xyl | O-Ac | OH | - | R | S | 10032468
23-Epi-26-deoxyactein | O-Xyl | O-Ac | H | - | R | R | 10974362
Cimiracemoside N | O-Ara | O-Ac | H | - | S | S | 21591918
Cimiracemoside P | O-Xyl | O-Ac | =O | - | S | R | 91827183
12-O-Acetylacteol | OH | O-Ac | OH | - | S | S | 23640137

Cimiracemoside type
Compounds | R1 | Δ7,8 | CID
Cimiracemoside A (=F) | O-Xyl | + | 21606551
Cimiracemoside H | O-Xyl | - | 21606553

Neocimigenoside type
Compounds | R1 | CID
Neocimicigenoside A | O-Ara | 44583839
Neocimicigenoside B | O-Xyl | 44583840

Cimilactone type
Compounds | R1 | Δ7,8 | CID
Cimilactone A | O-Xyl | - | 10908062

Podocarpaside type
Compounds | R1 | R2 | R3 | R4 | R5 | R6 | Δ5,6 | Δ5,11 | Δ10,11 | 5 | 11 | CID
Podocarpaside A | O-Ara | OH | - | H | H | H | - | + | - | - | - | 16110015
Podocarpaside B | O-Ara | H | H | OH | H | H | - | - | - | R | S | 16110011
Podocarpaside C | O-Ara | H | H | OH | H | OH | - | - | - | R | S | 16110016
Podocarpaside D | O-Ara | H | OH | H | H | H | - | - | - | S | S | 16110012
Podocarpaside E | O-Ara | H | - | OH | OH | OH | - | + | - | - | - | 139071967
Podocarpaside F | O-Ara | H | - | - | H | OH | - | - | + | R | - | 16110017
Podocarpaside G | O-Ara | H | - | - | - | OH | + | - | + | - | - | 16110014

Acerinol type
Compounds | R1 | CID
24-O-Acetylacerionol | O-Ac | 101596791
Acerinol | OH | 73347277

Cimicinol
Compound | CID
Cimicinol | 102146755

Actaeaepoxide-3-O-β-D-xylopyranoside
Compound | CID
Actaeaepoxide-3-O-β-D-xylopyranoside | 15515494

Friedelin
Compound | CID
Friedelin | 91472

Cinnamic acid derivatives
Compounds | R1 | R2 | R3 | CID
Sinapic acid | O-Me | OH | O-Me | 637775
p-Coumaric acid | H | OH | H | 637542
Isoferulic acid | OH | O-Me | H | 736186
3,4-Dimethoxycinnamic acid | O-Me | O-Me | H | 717531

Cimiracemate type
Compounds | R1 | R2 | R3 | CID
Cimiracemate A | OH | O-Me | H | 5315874
Cimiracemate B | O-Me | OH | H | 5315876
Cimiracemate C | OH | O-Me | O-Me * | 5315877
Cimiracemate D | O-Me | OH | O-Me * | 5315878
* Stereochemistry not known.

Cimiciphenone type
Compounds | R | CID
Cimiciphenone | O-Me | 71487912
Petasiphenone | OH | 16066851

Protocatechuic acid type
Compounds | R1 | CID
Protocatechuic acid | OH | 72
Protocatechuic aldehyde | H | 637542

Fukiic acid derivatives
Compounds | R1 | CID
Fukiic acid | OH | 161871
Piscidic acid | H | 120693

Cimicifugic acid derivatives
Compounds | R1 | R2 | R3 | R4 | CID
Cimicifugic acid A (2-Feruloyl fukinolic acid) | OH | OH | O-Me | OH | 6449879
Cimicifugic acid B (2-Isoferuloyl piscidic acid) | OH | OH | OH | O-Me | 6449880
Cimicifugic acid C (2-p Coumaric fukinolic acid) | OH | OH | H | OH | 6401178
Cimicifugic acid D (2-Caffeoyl piscidic acid) | OH | H | OH | OH | 11742743
Cimicifugic acid E (2-Feruloyl piscidic acid) | OH | H | O-Me | OH | 10002902
Cimicifugic acid F (2-Isoferuloyl piscidic acid) | OH | H | OH | O-Me | 6450179
Cimicifugic acid G (2-Feruloyl piscidic acid) | OH | OH | O-Me | O-Me | 11655574
Fukinolic acid | OH | OH | OH | OH | 6441059

Actealactone
Compound | CID
Actealactone | 11537736

Astilbin
Compound | CID
Astilbin | 119258

Chromones
Compound | R1 | R2 | R3 | Δ7a−7b | 7b | CID
Angelicain | CH2-OH | propan-2-ol | OH | - | S | 46240156
Cimifugin | O-Me | propan-2-ol | O-Me | - | S | 4411960
Visnagin | Me | H | O-Me | + | - | 6716

Cimidahurine
Compound | CID
Cimidahurine | 5315870

Cimipronidine
Compounds | R1 | CID
Cimipronidine | OH | 21594000
Cimipronidine methyl ester | O-Me | 101467166

Dopargine
Compound | CID
Dopargine | 10357001

Cyclocimipronidine
Compound | CID
Cyclocimipronidine | 101467165

N-Methylcytisine
Compound | CID
N-Methylcytisine | 670971

Table A2. Support of classification: similarity of Cimicifuga constituents to database elements.
Table A2. Support of classification: similarity of Cimicifuga constituents to database elements.
(Cosine-Similarity Score)
NoGeneric NameTop 1ScoreTop 2SoreTop 3ScoreTop 4ScoreTop 5Score
Cimi_112-beta-Acetoxy-CimigenolDMAT0.9022-Hydroxyestradiol0.899CHEMBL33931330.899CHEMBL31337620.899CHEMBL1967590.899
Cimi_212-beta-Acetyl-Cimigenol-3-O-beta-D-xylopyranosideCHEMBL33931330.951CHEMBL31337620.951CHEMBL1967590.9512-Hydroxyestradiol0.951Ezetimibe0.940
Cimi_312-beta-Hydroxy-CimigenolDMAT0.900CHEMBL1967590.899CHEMBL31337620.8992-Hydroxyestradiol0.899CHEMBL33931330.899
Cimi_412-O-AcetylacteolDMAT0.903CHEMBL31337620.900CHEMBL1967590.900CHEMBL33931330.9002-Hydroxyestradiol0.900
Cimi_515-O-Methyl-CimigenolCHEMBL33931330.8992-Hydroxyestradiol0.899CHEMBL31337620.899CHEMBL1967590.899CHEMBL33611280.887
Cimi_623-epi-26-Deoxyactein=27-DeoxyacteinCHEMBL33931330.952CHEMBL1967590.952CHEMBL31337620.9522-Hydroxyestradiol0.952CHEMBL23259010.931
Cimi_723-O-AcetylshengmanolCompound C20.909CHEMBL20172140.899CHEMBL3832460.895CHEMBL39634440.8946 Paradol0.894
Cimi_823-O-Acetylshengmanol 3-O-beta-D-xylopyranosideCHEMBL23259010.951CHEMBL3719680.934CHEMBL24208990.934Polydatin0.934Teneligliptin0.934
Cimi_923-O-Acetylshengmanol xylosideCHEMBL23259010.951CHEMBL3719680.934CHEMBL24208990.934Polydatin0.934Teneligliptin0.934
Cimi_1024-Acetylhydroshengmanol xylosideCHEMBL1967590.941CHEMBL33931330.941CHEMBL31337620.9412-Hydroxyestradiol0.941CHEMBL23259010.939
Cimi_1124-O-AcetylacerionolCHEMBL37462930.923GW275944X0.921Theasinensis A0.917CHEMBL10786650.901Mogrol0.897
Cimi_1225-AnhydroCimigenol-3-O-alpha-L-arabinoside6-O-Cinnamoyl-D-glucopyranose0.952GW290597X0.952GW458787A0.952delphinidin-3-glucoside0.952Ezetimibe0.952
Cimi_1325-O-AcetylCimigenolCHEMBL3832460.901CHEMBL39634440.9006 Paradol0.900CHEMBL41120130.900Ascofuranone0.878
Cimi_1425-O-AcetylCimigenol 3-o-alpha-L-arabinosideCHEMBL33931330.951CHEMBL31337620.951CHEMBL1967590.9512-Hydroxyestradiol0.951Ezetimibe0.940
Cimi_1525-O-Acetyl-cimigenol xylosideCHEMBL33931330.951CHEMBL31337620.951CHEMBL1967590.9512-Hydroxyestradiol0.951Ezetimibe0.940
Cimi_1625-O-Ethyl-cimigenol-3-O-beta-D-xylopyranoside2-Hydroxyestradiol0.951CHEMBL33931330.951CHEMBL31337620.951CHEMBL1967590.951GYY41370.928
Cimi_1725-O-Methyl-cimigenolCHEMBL33931330.8992-Hydroxyestradiol0.899CHEMBL31337620.899CHEMBL1967590.899CHEMBL33611280.887
Cimi_1825-O-Methyl-cimigenol-3-O-beta-D-xyloside2-Hydroxyestradiol0.951CHEMBL33931330.951CHEMBL31337620.951CHEMBL1967590.951GYY41370.928
Cimi_203,4-Dimethoxycinnamic acid4a-Isoalantolactone0.958Nootkatone0.958Gemcitabine0.957Fenoldopam0.936GW439255X0.913
Cimi_22AcerinolTheasinensis A0.917GW275944X0.895Folic caid0.8926-O-cinnamoyl-D-glucopyranose0.890Karaviloside X0.886
Cimi_23Actaeaepoxide 3-O-beta-D-xylopyranosideGSK978744A0.945CHEMBL23382310.943LCZ6960.943Tadalafil0.943Berteroin0.943
Cimi_24ActaealactoneC1290.859GW644007X0.827Clozapin0.827CHEMBL10816780.813CHEMBL40925080.812
Cimi_25ActeinCHEMBL1967590.941CHEMBL31337620.9412-Hydroxyestradiol0.941CHEMBL33931330.941Ezetimibe0.931
Cimi_26ActeolDMAT0.902CHEMBL31337620.8992-Hydroxyestradiol0.899CHEMBL33931330.899CHEMBL1967590.899
Cimi_27AngelicainGW644007X0.899CHEMBL37309160.864CHEMBL23382290.857CHEMBL2044200.838CHEMBL42171990.834
Cimi_28Bugbanoside ECHEMBL24208990.954Teneligliptin0.954Polydatin0.954CHEMBL3719680.954Cinacalcet0.954
Cimi_29Bugbanoside FFolic acid0.961GSK978744A0.954GW290597X0.952SC40.9526-O-cinnamoyl-D-glucopyranose0.952
Cimi_30Caffeic acid3-O-methylquercetin1.000CHEMBL2082860.977CHEMBL24082320.9344a-Isoalantolactone0.914Nootkatone0.914
Cimi_31Caffeic methyl esterNootkatone0.9804a-Isoalantolactone0.980Fenoldopam0.9593-O-methylquercetin0.938CHEMBL2082860.917
Cimi_32Cimicfugoside MCHEMBL1967590.9612-Hydroxyestradiol0.961CHEMBL31337620.961CHEMBL33931330.961GYY41370.938
Cimi_33CimicidanolGamma linolenic acid0.924Fluvastatin0.923CHEMBL10786650.913CHEMBL41141200.912Compound C20.912
Cimi_34Cimicifugic acid CGlyceolin0.957GW780056X0.926Oligomycin0.926Ibuprofen0.926CHEMBL40925080.878
Cimi_35Cimicifugic acid DGlyceolin0.957GW780056X0.926Oligomycin0.926Ibuprofen0.926CHEMBL40925080.878
Cimi_36Cimicifugic acid EGlyceolin0.930GW780056X0.900Oligomycin0.900Ibuprofen0.900Melatonin0.885
Cimi_37Cimicifugic acid FGlyceolin0.930GW780056X0.900Oligomycin0.900Ibuprofen0.900Melatonin0.885
Cimi_38Cimicifugoside H-1CHEMBL24208990.954Teneligliptin0.954Polydatin0.954CHEMBL3719680.954Cinacalcet0.954
Cimi_39Cimicifugoside H-2CHEMBL12301710.962Meriolin 10.962CHEMBL24208990.953Polydatin0.953Cinacalcet0.953
Cimi_40Cimicifugoside H-3CHEMBL12301710.982Polydatin0.973Teneligliptin0.973CHEMBL3719680.973GW576924A0.973
Cimi_41CimicinolFolic acid0.939GSK978744A0.934SC40.932GW290597X0.9326-O-cinnamoyl-D-glucopyranose0.932
Cimi_42CimiciphenoneParoxetine0.969CHEMBL41127410.921Melatonin0.919CHEMBL40666280.904Ibuprofen0.904
Cimi_43CimidahurineGamma-oryzanol0.973Atractylenolide III0.960Sirtinol0.937Zidovudine0.933Compound 590.926
Cimi_44CimifuginCHEMBL39300060.858GW644007X0.858GW708336X0.852Palbociclib0.849CHEMBL2044200.844
Cimi_45CimigenolDMAT0.900CHEMBL1967590.899CHEMBL31337620.8992-Hydroxyestradiol0.899CHEMBL33931330.899
Cimi_46Cimigenol xylosideCHEMBL1967590.9612-Hydroxyestradiol0.961CHEMBL31337620.961CHEMBL33931330.961GYY41370.938
Cimi_47CimigolDMAT0.900CHEMBL1967590.899CHEMBL31337620.8992-Hydroxyestradiol0.899CHEMBL33931330.899
Cimi_48Cimilactone BGW576924A0.953Teneligliptin0.953CHEMBL24208990.953polydatin0.953Cinacalcet0.953
Cimi_49CimipronidineCHEMBL19332790.778Cheletyrine0.764Pinosylvin0.758Nordihydroguaiaretic acid0.755CHEMBL37301460.723
Cimi_50Cimipronidine methyl esterCHEMBL19332790.743Pinosylvin0.741GSK182497A0.713Cheletyrine0.710BDE-2090.705
Cimi_51Cimiracemate AParoxetine0.954Oligomycin0.922GW780056X0.922Ibuprofen0.922CHEMBL41127410.907
Cimi_52Cimiracemate BParoxetine0.954Oligomycin0.922GW780056X0.922Ibuprofen0.922CHEMBL41127410.907
Cimi_53Cimiracemate CParoxetine0.956CHEMBL40666280.926Melatonin0.910CHEMBL41127410.880CHEMBL40925080.878
Cimi_54Cimiracemate DParoxetine0.956CHEMBL40666280.926Melatonin0.910CHEMBL41127410.880CHEMBL40925080.878
Cimi_55Cimiracemoside A (=F)GSK978744A0.963Ezetimibe0.962Monensin0.952Folic acid0.952GSK192082A0.945
Cimi_56Cimiracemoside BCHEMBL33931330.990CHEMBL31337620.990CHEMBL1967590.9902-Hydroxyestradiol0.990GYY41370.970
Cimi_57Cimiracemoside CCHEMBL1967590.9612-Hydroxyestradiol0.961CHEMBL31337620.961CHEMBL33931330.961GYY41370.938
Cimi_58Cimiracemoside DCHEMBL33931330.951CHEMBL31337620.951CHEMBL1967590.9512-Hydroxyestradiol0.951Ezetimibe0.940
Cimi_59Cimiracemoside ECHEMBL23259010.981Cinacalcet0.962Teneligliptin0.962Polydatin0.962CHEMBL3719680.962
Cimi_60Cimiracemoside GGSK978744A0.963Ezetimibe0.962Monensin0.952Folic acid0.952GSK192082A0.945
Cimi_61Cimiracemoside HCHEMBL33931330.951CHEMBL31337620.951CHEMBL1967590.9512-Hydroxyestradiol0.951Ezetimibe0.940
Cimi_62Cimiracemoside JEzetimibe0.962Monensin0.952CHEMBL39750110.952CHEMBL42460000.952GSK978744A0.945
Cimi_63Cimiracemoside KEzetimibe0.962Monensin0.952CHEMBL39750110.952CHEMBL42460000.952GSK978744A0.945
Cimi_64Cimiracemoside LCHEMBL23259010.929CDN11630.918Cinacalcet0.914CHEMBL24208990.914polydatin0.914
Cimi_65Cimiracemoside NCHEMBL33931330.952CHEMBL1967590.952CHEMBL31337620.9522-Hydroxyestradiol0.952CHEMBL23259010.931
Cimi_66Cimiracemoside PCHEMBL31337620.932CHEMBL1967590.9322-Hydroxyestradiol0.932CHEMBL33931330.932CHEMBL23259010.931
Cimi_67Cimiside ACHEMBL1967590.9612-Hydroxyestradiol0.961CHEMBL31337620.961CHEMBL33931330.961GYY41370.938
Cimi_68Cimiside BCHEMBL1967590.9712-Hydroxyestradiol0.971CHEMBL31337620.971CHEMBL33931330.971GYY41370.949
Cimi_69Cimiside E6-O-Cinnamoyl-D-glucopyranose0.952GW290597X0.952GW458787A0.952delphinidin-3-glucoside0.952Ezetimibe0.952
Cimi_70CyclocmipronidineCheletyrine0.80115,16-dihydrotanshinone I0.800CHEMBL37309330.800Nordihydroguaiaretic acid0.791CHEMBL1882820.757
Cimi_71DahurinolCHEMBL3832460.901CHEMBL40940800.8972G110.897Gamma linolenic acid0.884CHEMBL37358900.884
Cimi_72DopargineTangeretin0.768GSK182497A0.751SC-2026710.748CHEMBL39274650.748Oleic acid0.747
Cimi_73Ferulic acid methyl ester4a-Isoalantolactone0.958Nootkatone0.958Gemcitabine0.957Fenoldopam0.936GW439255X0.913
Cimi_74FormononetinCHEMBL37746321.000SB-4095140.984PP4870.969CHEMBL33931310.969CHEMBL2076740.969
Cimi_75FriedelinProcyanidin B20.891CHEMBL41120130.857GSK635416A0.857CHEMBL37278650.850CHEMBL38592680.840
Cimi_76Fukiic acidGlyceolin0.924Adenine0.861CHEMBL24082320.861Icaritin0.859Oligomycin0.853
Cimi_77Fukinolic acidGlyceolin0.957GW780056X0.926Oligomycin0.926Ibuprofen0.926CHEMBL40925080.878
Cimi_78IsoCimicifugamideCompound 590.869Nummularic acid0.858Sirtinol0.858Bupivacaine0.849CHEMBL39313500.837
Cimi_79Isoferulic acidNootkatone1.0004a-Isoalantolactone1.000fenoldopam0.9793-O-methylquercetin0.914Gemcitabine0.914
Cimi_80Neocimicigenoside ACHEMBL1967590.941CHEMBL33931330.941CHEMBL31337620.9412-Hydroxyestradiol0.941CHEMBL23259010.939
Cimi_81Neocimicigenoside BCHEMBL1967590.941CHEMBL33931330.941CHEMBL31337620.9412-Hydroxyestradiol0.941CHEMBL23259010.939
Cimi_82N-MethylcytisineTBB0.797CHEMBL39670750.784Soyasapogenol C0.780Momordicoside Q0.780GW827396X0.738
Cimi_83p-Coumaric acidSulforaphane1.0003-O-Methylquercetin0.879GW782612X0.868CHEMBL37363200.849CHEMBL2082860.847
Cimi_84PetasiphenoneCHEMBL41127410.951Paroxetine0.936Ibuprofen0.933GW780056X0.933Oligomycin0.933
Cimi_85Piscidic acidGlyceolin0.892Adenine0.891Icaritin0.831Ibuprofen0.814Oligomycin0.814
Cimi_86Podocarpaside A CHEMBL12301710.962Meriolin0.962CHEMBL24208990.953Polydatin0.953Cinacalcet0.953
Cimi_87Podocarpaside B CHEMBL12301710.962Meriolin0.962CHEMBL24208990.953Polydatin0.953Cinacalcet0.953
Cimi_88Podocarpaside CCHEMBL12301710.962Meriolin0.962CHEMBL24208990.953Polydatin0.953Cinacalcet0.953
Cimi_89Podocarpaside DCHEMBL12301710.962Meriolin0.962CHEMBL24208990.953Polydatin0.953Cinacalcet0.953
Cimi_90Podocarpaside FCHEMBL12301710.962Meriolin0.962CHEMBL24208990.953Polydatin0.953Cinacalcet0.953
Cimi_91Podocarpaside G CHEMBL12301710.952Meriolin0.952Teneligliptin0.944Polydatin0.944CHEMBL3719680.944
Cimi_92ProtocatechualdehydeCHEMBL2082860.9513-O-Methylquercetin0.929Belinostat0.923CHEMBL24082320.909GW782612X0.872
Cimi_93Protocatechuic acidCHEMBL2082861.0003-O-Methylquercetin0.977CHEMBL24082320.956Fenoldopam0.910Nootkatone0.891
Cimi_94ShengmanolCHEMBL1967590.910DMAT0.892CHEMBL31337620.889CHEMBL33931330.8892-Hydroxyestradiol0.889
Cimi_95Shengmanol xylosideCHEMBL1967590.951CHEMBL31337620.9312-Hydroxyestradiol0.931CHEMBL33931330.931Ezetimibe0.920
Cimi_96Sinapic acidNootkatone0.9614a-Isoalantolactone0.961Fenoldopam0.941Gemcitabine0.920CHEMBL40666280.895
Cimi_97VisnaginPrednisolone0.938CHEMBL39766460.889Rifampicin0.889CHEMBL2081180.889Monascus 0.889
Cimi_2_
metab
12-beta-AcetoxycimigenolDMAT0.9022-Hydroxyestradiol0.899CHEMBL33931330.899CHEMBL31337620.899CHEMBL1967590.899
Cimi_6_
metab
Cimi_6_metabCHEMBL33931330.910CHEMBL1967590.9102-Hydroxyestradiol0.910CHEMBL31337620.910Monensin 0.908
Cimi_8_
metab
23-O-AcetylshengmanolCompound C20.909CHEMBL20172140.899CHEMBL3832460.895CHEMBL39634440.8946 Paradol0.894
Cimi_9_
metab
Cimi_9_metabCompound C20.909CHEMBL20172140.899CHEMBL3832460.895CHEMBL39634440.8946 Paradol0.894
Cimi_10_
metab
Cimi_10_metabCHEMBL1967590.8992-Hydroxyestradiol0.899CHEMBL31337620.899CHEMBL33931330.899CHEMBL40940800.899
Cimi_12_
metab
Cimi_12_metabAscofuranone0.919AKOS0078659320.907CHEMBL42460000.907CHEMBL39750110.907delphinidin-3-glucoside0.898
Cimi_14_
metab
Cimi_14_metabDMAT0.9022-Hydroxyestradiol0.899CHEMBL33931330.899CHEMBL31337620.899CHEMBL1967590.899
Cimi_15_metab | Cimi_15_metab | DMAT | 0.902 | 2-Hydroxyestradiol | 0.899 | CHEMBL3393133 | 0.899 | CHEMBL3133762 | 0.899 | CHEMBL196759 | 0.899
Cimi_16_metab | 25-O-Methylcimigenol | CHEMBL3393133 | 0.899 | 2-Hydroxyestradiol | 0.899 | CHEMBL3133762 | 0.899 | CHEMBL196759 | 0.899 | CHEMBL3361128 | 0.887
Cimi_18_metab | 25-O-Methylcimigenol | CHEMBL3393133 | 0.899 | 2-Hydroxyestradiol | 0.899 | CHEMBL3133762 | 0.899 | CHEMBL196759 | 0.899 | CHEMBL3361128 | 0.887
Cimi_19_metab | Cimi_19_metab | DMAT | 0.903 | CHEMBL3133762 | 0.900 | CHEMBL196759 | 0.900 | CHEMBL3393133 | 0.900 | 2-Hydroxyestradiol | 0.900
Cimi_22_metab | Cimi_22_metab | Monensin | 0.909 | GSK978744A | 0.905 | CHEMBL2338231 | 0.903 | Berteroin | 0.903 | LCZ696 | 0.903
Cimi_23_metab | Cimi_23_metab | Monensin | 0.909 | GSK978744A | 0.905 | CHEMBL2338231 | 0.903 | Berteroin | 0.903 | LCZ696 | 0.903
Cimi_25_metab | Cimi_25_metab | DMAT | 0.903 | CHEMBL3133762 | 0.900 | CHEMBL196759 | 0.900 | CHEMBL3393133 | 0.900 | 2-Hydroxyestradiol | 0.900
Cimi_28_metab | Cimi_28_metab | CHEMBL3746293 | 0.925 | CHEMBL1078665 | 0.925 | Compound C2 | 0.924 | CHEMBL4114120 | 0.902 | GW275944X | 0.902
Cimi_29_metab | Cimi_29_metab | CHEMBL2376144 | 0.899 | CHEMBL2041962 | 0.898 | CHEMBL3427184 | 0.898 | Epiberberine | 0.898 | GW275944X | 0.898
Cimi_32_metab | Cimi_32_metab | DMAT | 0.900 | CHEMBL196759 | 0.899 | CHEMBL3133762 | 0.899 | 2-Hydroxyestradiol | 0.899 | CHEMBL3393133 | 0.899
Cimi_38_metab | Cimicidanol | Gamma linolenic acid | 0.924 | Fluvastatin | 0.923 | CHEMBL1078665 | 0.913 | CHEMBL4114120 | 0.912 | Compound C2 | 0.912
Cimi_39_metab | Cimi_39_metab | Gamma linolenic acid | 0.966 | Fluvastatin | 0.966 | CHEMBL2337767 | 0.955 | Epiberberine | 0.955 | CHEMBL1078665 | 0.933
Cimi_40_metab | Cimi_40_metab | C128 | 0.956 | Pitavastatin | 0.955 | Xanthohumol | 0.928 | Fluvastatin | 0.920 | CHEMBL4114120 | 0.909
Cimi_41_metab | Cimi_41_metab | GW631581B | 0.894 | Theasinensis A | 0.884 | Compound C2 | 0.884 | GW275944X | 0.881 | Folic acid | 0.880
Cimi_43_metab | Hydroxytyrosol | CHEMBL1233881 | 1.000 | CHEMBL4112741 | 0.830 | CHEMBL3909286 | 0.823 | gamma-oryzanol | 0.802 | CHEMBL4215572 | 0.793
Cimi_46_metab | Cimigenol | DMAT | 0.900 | CHEMBL196759 | 0.899 | CHEMBL3133762 | 0.899 | 2-Hydroxyestradiol | 0.899 | CHEMBL3393133 | 0.899
Cimi_56_metab | Cimi_56_metab | CHEMBL3393133 | 0.941 | 2-Hydroxyestradiol | 0.941 | CHEMBL3133762 | 0.941 | CHEMBL196759 | 0.941 | CHEMBL4278763 | 0.917
Cimi_57_metab | Cimigenol | DMAT | 0.900 | CHEMBL196759 | 0.899 | CHEMBL3133762 | 0.899 | 2-Hydroxyestradiol | 0.899 | CHEMBL3393133 | 0.899
Cimi_58_metab | 12beta-acetoxycimigenol | DMAT | 0.902 | 2-Hydroxyestradiol | 0.899 | CHEMBL3393133 | 0.899 | CHEMBL3133762 | 0.899 | CHEMBL196759 | 0.899
Cimi_59_metab | Cimi_59_metab | CHEMBL2325901 | 0.939 | CHEMBL2420899 | 0.923 | CHEMBL371968 | 0.923 | Teneligliptin | 0.923 | Polydatin | 0.923
Cimi_61_metab | Cimi_61_metab | CHEMBL3393133 | 0.920 | CHEMBL3133762 | 0.920 | CHEMBL196759 | 0.920 | 2-Hydroxyestradiol | 0.920 | Monensin | 0.917
Cimi_62_metab | Cimi_62_metab | Ascofuranone | 0.923 | AKOS007865932 | 0.911 | Monensin | 0.908 | CHEMBL3975011 | 0.908 | CHEMBL4246000 | 0.908
Cimi_63_metab | Cimi_63_metab | Ascofuranone | 0.923 | AKOS007865932 | 0.911 | Monensin | 0.908 | CHEMBL3975011 | 0.908 | CHEMBL4246000 | 0.908
Cimi_64_metab | 23-O-Acetylshengmanol | Compound C2 | 0.909 | CHEMBL2017214 | 0.899 | CHEMBL383246 | 0.895 | CHEMBL3963444 | 0.894 | 6 Paradol | 0.894
Cimi_65_metab | Cimi_65_metab | CHEMBL3393133 | 0.910 | CHEMBL196759 | 0.910 | 2-Hydroxyestradiol | 0.910 | CHEMBL3133762 | 0.910 | Monensin | 0.908
Cimi_66_metab | Cimi_66_metab | CHEMBL2017214 | 0.911 | DMAT | 0.903 | CHEMBL3735890 | 0.889 | CHEMBL383246 | 0.885 | CHEMBL3393133 | 0.879
Cimi_67_metab | 12beta-Hydroxycimigenol | DMAT | 0.900 | CHEMBL196759 | 0.899 | CHEMBL3133762 | 0.899 | 2-Hydroxyestradiol | 0.899 | CHEMBL3393133 | 0.899
Cimi_68_metab | Cimigenol | DMAT | 0.900 | CHEMBL196759 | 0.899 | CHEMBL3133762 | 0.899 | 2-Hydroxyestradiol | 0.899 | CHEMBL3393133 | 0.899
Cimi_69_metab | Cimi_69_metab | Ascofuranone | 0.919 | AKOS007865932 | 0.907 | CHEMBL4246000 | 0.907 | CHEMBL3975011 | 0.907 | delphinidin-3-glucoside | 0.898
Cimi_78_metab | Cimi_78_metab | SB-732941 | 0.924 | CHEMBL3933251 | 0.875 | CHEMBL3728128 | 0.870 | Corosolic acid | 0.852 | Hernandezine | 0.800
Cimi_80_metab | Cimi_80_metab | Urolithin A | 0.907 | CHEMBL2017214 | 0.897 | CHEMBL383246 | 0.891 | DMAT | 0.890 | CHEMBL3963444 | 0.890
Cimi_81_metab | Cimi_81_metab | Urolithin A | 0.907 | CHEMBL2017214 | 0.897 | CHEMBL383246 | 0.891 | DMAT | 0.890 | CHEMBL3963444 | 0.890
Cimi_86_metab | Cimi_86_metab | Gamma linolenic acid | 0.955 | CHEMBL1078665 | 0.944 | CHEMBL4114120 | 0.943 | GW780159X | 0.942 | Crocin | 0.941
Cimi_87_metab | Cimi_87_metab | CHEMBL4114120 | 0.953 | Crocin | 0.952 | Gamma linolenic acid | 0.943 | Fluvastatin | 0.941 | Fucoxanthin | 0.940
Cimi_88_metab | Cimi_88_metab | Gamma linolenic acid | 0.955 | Fluvastatin | 0.954 | CHEMBL1078665 | 0.944 | CHEMBL4114120 | 0.943 | Epiberberine | 0.942
Cimi_89_metab | Cimi_89_metab | CHEMBL4114120 | 0.953 | Crocin | 0.952 | Gamma linolenic acid | 0.943 | Fluvastatin | 0.941 | Fucoxanthin | 0.940
Cimi_90_metab | Cimi_90_metab | CHEMBL1230171 | 0.962 | Meriolin 1 | 0.962 | CHEMBL2420899 | 0.953 | polydatin | 0.953 | Cinacalcet | 0.953
Cimi_91_metab | Cimi_91_metab | Gamma linolenic acid | 0.943 | Fluvastatin | 0.941 | CHEMBL1078665 | 0.932 | CHEMBL4114120 | 0.930 | Epiberberine | 0.929
Cimi_92_metab | Cimi_92_metab | CHEMBL196759 | 0.910 | DMAT | 0.892 | CHEMBL3133762 | 0.889 | CHEMBL3393133 | 0.889 | 2-Hydroxyestradiol | 0.889
Median 0.941 0.923 0.922 0.912 0.907
Highlighted in blue are the constituents for which no comparison molecule was found in the database with a cosine similarity score > 0.8.
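
The similarity scores in this table are cosine similarities between fingerprint vectors, with 0.8 as the cut-off for a usable comparison molecule. The following is a minimal sketch of that calculation, assuming binary bit-vector fingerprints. The 8-bit vectors, the reference names (`ref_A`, `ref_B`, `ref_C`) and the helper `top_matches` are hypothetical illustrations, not the authors' data or code; real MACCS keys are much longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length binary fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty fingerprint is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_matches(query, database, k=5, threshold=0.8):
    """Return the k best (name, score) pairs and whether any score exceeds threshold."""
    scored = sorted(
        ((name, cosine_similarity(query, fp)) for name, fp in database.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )[:k]
    has_match = any(score > threshold for _, score in scored)
    return scored, has_match


# Hypothetical 8-bit fingerprints (assumption: invented for illustration only).
query = [1, 1, 0, 1, 0, 0, 1, 0]
database = {
    "ref_A": [1, 1, 0, 1, 0, 0, 1, 0],  # identical bit pattern
    "ref_B": [1, 1, 0, 1, 0, 0, 0, 0],  # shares three of the four set bits
    "ref_C": [0, 0, 1, 0, 1, 1, 0, 1],  # shares no set bits
}
matches, has_match = top_matches(query, database, k=2)
for name, score in matches:
    print(f"{name}: {score:.3f}")
```

A constituent for which `has_match` is false corresponds to the highlighted rows, where no database molecule exceeded the 0.8 cut-off.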

References

  1. Wuttke, W.; Seidlová-Wuttke, D.; Gorkow, C. The Cimicifuga preparation BNO 1055 vs. conjugated estrogens in a double-blind placebo-controlled study: Effects on menopause symptoms and bone markers. Maturitas 2003, 44, S67–S77.
  2. Osmers, R.; Friede, M.; Liske, E.; Schnitker, J.; Freudenstein, J.; Henneicke von Zepelin, H.H. Efficacy and safety of isopropanolic black cohosh extract for climacteric symptoms. Obstet. Gynecol. 2005, 105, 1074–1083.
  3. Schellenberg, R.; Saller, R.; Hess, L.; Melzer, J.; Zimmermann, C.; Drewe, J.; Zahner, C. Dose-dependent effects of the Cimicifuga racemosa extract Ze450 in the treatment of climacteric complaints: A randomized, placebo-controlled study. Evid. Based Complement. Altern. Med. 2012, 2012, 260301.
  4. Drewe, J.; Zimmermann, C.; Zahner, C. The effect of a Cimicifuga racemosa extracts Ze 450 in the treatment of climacteric complaints: An observational study. Phytomed. Int. J. Phytother. Phytopharm. 2013, 20, 659–666.
  5. HMPC. European Union Herbal Monograph on Cimicifuga racemosa (L.) Nutt., Rhizoma; European Medicines Agency: Amsterdam, The Netherlands, 2018.
  6. Seidlova-Wuttke, D.; Eder, N.; Stahnke, V.; Kammann, M.; Stecher, G.; Haunschild, J.; Wessels, J.T.; Wuttke, W. Cimicifuga racemosa and its triterpene-saponins prevent the Metabolic Syndrome and deterioration of cartilage in the knee joint of ovariectomized rats by similar mechanisms. Phytomed. Int. J. Phytother. Phytopharm. 2012, 19, 846–853.
  7. Hardie, D.G. Keeping the home fires burning: AMP-activated protein kinase. J. R. Soc. Interface 2018, 15, 20170774.
  8. Moser, C.; Vickers, S.P.; Brammer, R.; Cheetham, S.C.; Drewe, J. Antidiabetic effects of the Cimicifuga racemosa extract Ze 450 in vitro and in vivo in ob/ob mice. Phytomed. Int. J. Phytother. Phytopharm. 2014, 21, 1382–1389.
  9. Drewe, J.; Boonen, G.; Culmsee, C. Treat more than heat: New therapeutic implications of Cimicifuga racemosa through AMPK-dependent metabolic effects. Phytomed. Int. J. Phytother. Phytopharm. 2022, 100, 154060.
  10. Hardie, D.G. Regulation of AMP-activated protein kinase by natural and synthetic activators. Acta Pharm. Sin. B 2016, 6, 1–19.
  11. Sharma, H.; Kumar, S. Natural AMPK Activators: An Alternative Approach for the Treatment and Management of Metabolic Syndrome. Curr. Med. Chem. 2017, 24, 1007–1047.
  12. Drewe, J.; Küsters, E.; Hammann, F.; Kreuter, M.; Boss, P.; Schöning, V. Modeling Structure-Activity Relationship of AMPK Activation. Molecules 2021, 26, 6508.
  13. Vazirian, M.; Nabavi, S.M.; Jafari, S.; Manayi, A. Natural activators of adenosine 5′-monophosphate (AMP)-activated protein kinase (AMPK) and their pharmacological activities. Food Chem. Toxicol. 2018, 122, 69–79.
  14. Francini, F.; Schinella, G.R.; Rios, J.L. Activation of AMPK by Medicinal Plants and Natural Products: Its Role in Type 2 Diabetes Mellitus. Mini Rev. Med. Chem. 2019, 19, 880–901.
  15. Anjum, J.; Mitra, S.; Das, R.; Alam, R.; Mojumder, A.; Emran, T.B.; Islam, F.; Rauf, A.; Hossain, M.J.; Aljohani, A.S.M.; et al. A renewed concept on the MAPK signaling pathway in cancers: Polyphenols as a choice of therapeutics. Pharmacol. Res. 2022, 184, 106398.
  16. Francis, G.; Kerem, Z.; Makkar, H.P.; Becker, K. The biological action of saponins in animal systems: A review. Br. J. Nutr. 2002, 88, 587–605.
  17. Yu, K.; Chen, F.; Li, C. Absorption, disposition, and pharmacokinetics of saponins from Chinese medicinal herbs: What do we know and what do we need to know more? Curr. Drug Metab. 2012, 13, 577–598.
  18. Tawab, M.A.; Bahr, U.; Karas, M.; Wurglics, M.; Schubert-Zsilavecz, M. Degradation of ginsenosides in humans after oral administration. Drug Metab. Dispos. 2003, 31, 1065–1071.
  19. Setchell, K.D.; Brown, N.M.; Desai, P.B.; Zimmer-Nechimias, L.; Wolfe, B.; Jakate, A.S.; Creutzinger, V.; Heubi, J.E. Bioavailability, disposition, and dose-response effects of soy isoflavones when consumed by healthy women at physiologically typical dietary intakes. J. Nutr. 2003, 133, 1027–1035.
  20. Setchell, K.D.; Brown, N.M.; Zimmer-Nechemias, L.; Brashear, W.T.; Wolfe, B.E.; Kirschner, A.S.; Heubi, J.E. Evidence for lack of absorption of soy isoflavone glycosides in humans, supporting the crucial role of intestinal metabolism for bioavailability. Am. J. Clin. Nutr. 2002, 76, 447–453.
  21. Li, J.X.; Yu, Z.Y. Cimicifugae rhizoma: From origins, bioactive constituents to clinical outcomes. Curr. Med. Chem. 2006, 13, 2927–2951.
  22. Durant, J.L.; Leland, B.A.; Henry, D.R.; Nourse, J.G. Reoptimization of MDL keys for use in drug discovery. J. Chem. Inf. Comput. Sci. 2002, 42, 1273–1280.
  23. Orosz, A.; Heberger, K.; Racz, A. Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets. Front. Chem. 2022, 10, 852893.
  24. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-sampling Technique. J. Artif. Intellig. Res. 2002, 16, 321–357.
  25. OECD. OECD Principles for the Validation, for Regulatory Purposes, of (Quantitative) Structure-Activity Relationship Models; OECD: Paris, France, 2004; Available online: https://www.oecd.org (accessed on 4 April 2024).
  26. Tropsha, A. Best Practices for QSAR Model Development, Validation, and Exploitation. Mol. Inf. 2010, 29, 476–488.
  27. van der Maaten, L.J.P.; Hinton, G.E. Visualizing high-dimensional data using t-SNE. J. Mach. Learn. Res. 2008, 9.
  28. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
  29. Tolles, J.; Meurer, W.J. Logistic Regression: Relating Patient Characteristics to Outcomes. JAMA 2016, 316, 533–534.
  30. Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. 2006, 27, 861–874.
  31. Daina, A.; Michielin, O.; Zoete, V. SwissADME: A free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules. Sci. Rep. 2017, 7, 42717.
  32. Delaney, J.S. ESOL: Estimating aqueous solubility directly from molecular structure. J. Chem. Inf. Comput. Sci. 2004, 44, 1000–1005.
  33. Ali, J.; Camilleri, P.; Brown, M.B.; Hutt, A.J.; Kirton, S.B. Revisiting the general solubility equation: In silico prediction of aqueous solubility incorporating the effect of topographical polar surface area. J. Chem. Inf. Model. 2012, 52, 420–428.
  34. Cheng, T.; Zhao, Y.; Li, X.; Lin, F.; Xu, Y.; Zhang, X.; Li, Y.; Wang, R.; Lai, L. Computation of octanol-water partition coefficients by guiding an additive model with knowledge. J. Chem. Inf. Model. 2007, 47, 2140–2148.
  35. Lipinski, C.A.; Lombardo, F.; Dominy, B.W.; Feeney, P.J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Deliv. Rev. 2012, 64, 4–17.
  36. Bae, I.S.; Kim, S.H. Sinapic Acid Promotes Browning of 3T3-L1 Adipocytes via p38 MAPK/CREB Pathway. Biomed. Res. Int. 2020, 2020, 5753623.
  37. Singh, S.S.B.; Patil, K.N. trans-ferulic acid attenuates hyperglycemia-induced oxidative stress and modulates glucose metabolism by activating AMPK signaling pathway in vitro. J. Food Biochem. 2022, 46, e14038.
  38. Lin, B.; Wan, H.; Yang, J.; Yu, L.; Zhou, H.; Wan, H. Lipid regulation of protocatechualdehyde and hydroxysafflor yellow A via AMPK/SREBP2/PCSK9/LDLR signaling pathway in hyperlipidemic zebrafish. Heliyon 2024, 10, e24908.
  39. Dong, Y.Z.; Li, L.; Espe, M.; Lu, K.L.; Rahimnejad, S. Hydroxytyrosol Attenuates Hepatic Fat Accumulation via Activating Mitochondrial Biogenesis and Autophagy through the AMPK Pathway. J. Agric. Food Chem. 2020, 68, 9377–9386.
  40. Navarro del Hierro, J.; Herrera, T.; Fornari, T.; Reglero, G. The gastrointestinal behavior of saponins and its significance for their bioavailability and bioactivities. J. Funct. Food 2018, 40, 484–497.
  41. Disch, L.; Forsch, K.; Siewert, B.; Drewe, J.; Fricker, G. In vitro and in situ characterization of triterpene glycosides from Cimicifuga racemosa extract. J. Pharm. Sci. 2017, 106, 3642–3650.
  42. van Breemen, R.B.; Liang, W.; Banuvar, S.; Shulman, L.P.; Pang, Y.; Tao, Y.; Nikolic, D.; Krock, K.M.; Fabricant, D.S.; Chen, S.N.; et al. Pharmacokinetics of 23-epi-26-deoxyactein in women after oral administration of a standardized extract of black cohosh. Clin. Pharmacol. Ther. 2010, 87, 219–225.
  43. Ramesh, M.; Vepuri, S.B.; Oosthuizen, F.; Soliman, M.E. Adenosine Monophosphate-Activated Protein Kinase (AMPK) as a Diverse Therapeutic Target: A Computational Perspective. Appl. Biochem. Biotechnol. 2016, 178, 810–830.
  44. Li, Y.; Peng, J.; Li, P.; Du, H.; Li, Y.; Liu, X.; Zhang, L.; Wang, L.L.; Zuo, Z. Identification of potential AMPK activator by pharmacophore modeling, molecular docking and QSAR study. Comput. Biol. Chem. 2019, 79, 165–176.
Figure 1. Flow and structure of experiments.
Figure 2. t-SNE analysis of AMPK activators and controls based on MACCS fingerprint descriptors (N = 2242, perplexity = 100, number of iterations = 5000).
Figure 3. Distribution of four important physicochemical parameters between activators and controls: (A) molecular weight, (B) topological polar surface area (TPSA), (C) number of rings per molecule, and (D) predicted octanol/water partition coefficient (XLogP). The distributions differed significantly (Mann–Whitney test): compared with controls, activators had lower molecular weights (p < 0.0001), higher lipophilicity (median XLogP 2.4 vs. 1.5; p < 0.0001), lower TPSA (p < 0.009), and fewer rings on average (3.6 vs. 4.1; p < 0.0001).
Figure 4. Receiver operating characteristic (ROC) analysis coupled with 10-fold cross-validation: (A) Deep Neural Network (DNN); (B) Logistic Regression Classification (LRC); and (C) Random Forest Classification (RFC).
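
To make the ROC analysis with 10-fold cross-validation concrete, here is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the one-dimensional toy data and the `fit_midpoint_scorer` stand-in for a real classifier are assumptions made for illustration, and the fold assignment is a simple stratified round-robin. It shows how a ROC AUC can be computed per held-out fold and summarised as mean and standard deviation.

```python
import random
import statistics


def roc_auc(labels, scores):
    """AUC as P(random positive outscores random negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, seed=0):
    """Shuffle each class, then deal its indices round-robin into k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds


def fit_midpoint_scorer(xs, ys):
    """Toy stand-in for a classifier: signed distance from the class-mean midpoint."""
    mu_pos = statistics.mean(x for x, y in zip(xs, ys) if y == 1)
    mu_neg = statistics.mean(x for x, y in zip(xs, ys) if y == 0)
    mid = (mu_pos + mu_neg) / 2
    sign = 1.0 if mu_pos >= mu_neg else -1.0
    return lambda x: sign * (x - mid)


def cross_validated_auc(xs, ys, k=10, seed=0):
    """Refit on k-1 folds, score the held-out fold, return (mean, SD) of fold AUCs."""
    aucs = []
    for held_out in stratified_folds(ys, k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        scorer = fit_midpoint_scorer([xs[i] for i in train], [ys[i] for i in train])
        aucs.append(roc_auc([ys[i] for i in held_out],
                            [scorer(xs[i]) for i in held_out]))
    return statistics.mean(aucs), statistics.stdev(aucs)


# Hypothetical toy data (assumption): 20 controls (0) and 20 activators (1).
rng = random.Random(42)
ys = [0] * 20 + [1] * 20
xs = [rng.gauss(0.0, 1.0) for _ in range(20)] + [rng.gauss(2.0, 1.0) for _ in range(20)]
mean_auc, sd_auc = cross_validated_auc(xs, ys, k=10)
print(f"ROC AUC = {100 * mean_auc:.1f} +/- {100 * sd_auc:.1f} %")
```

Stratifying the folds keeps both classes present in every held-out fold, which the pairwise AUC definition requires.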
Figure 5. Classification of Cimicifuga racemosa root constituents by three methods: (A) Deep Neural Network (DNN); (B) Logistic Regression Classification (LRC); and (C) Random Forest Classification (RFC). Saponins and their aglycones were unequivocally classified as activators of AMPK. Other constituents were also classified as activators, albeit with lower probabilities. Among these, cyclocimipronidine and dopargine were classified with uncertainty, as was N-methylcytisine, which the DNN model classified as inactive.
Figure 6. Comparison of triterpene saponin constituents with their theoretically derived aglycones (descriptors computed with the open-source SwissADME web tool [31]): (A) the molecular weight of the aglycones was significantly smaller than that of the saponins (p < 0.0001; paired two-sided t-test); (B) water solubility surprisingly showed high overlap between the two groups but was significantly lower in the aglycones (p = 0.02); (C) topological polar surface area (TPSA) was significantly smaller in the aglycones (p < 0.0001; paired two-sided t-test); (D) lipophilicity, expressed as XLogP, was significantly higher in the aglycones (p < 0.0001; paired two-sided t-test); (E) the number of violations of Lipinski’s rule of five differed significantly (p = 0.01; Wilcoxon signed-rank test), with aglycones showing fewer violations; and (F) the estimated lead-likeness score did not differ significantly.
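
Panel (E) counts violations of Lipinski's rule of five. As a minimal sketch, the classic formulation flags molecular weight above 500, logP above 5, more than 5 hydrogen-bond donors, and more than 10 hydrogen-bond acceptors. The web tool may use a different lipophilicity estimate and cut-off, so the thresholds below are an assumption, and the two descriptor sets are invented values for a hypothetical glycoside and aglycone, not data from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    mol_weight: float   # g/mol
    logp: float         # octanol/water partition coefficient estimate
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(d):
    """Count how many of the four classic rule-of-five criteria are violated."""
    checks = (
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    )
    return sum(checks)


# Hypothetical values (assumption): removing the sugar lowers the weight and
# the number of hydrogen-bond donors/acceptors while raising lipophilicity.
glycoside = Descriptors(mol_weight=660.0, logp=2.0, h_donors=6, h_acceptors=11)
aglycone = Descriptors(mol_weight=488.0, logp=4.5, h_donors=3, h_acceptors=5)

print("glycoside violations:", lipinski_violations(glycoside))
print("aglycone violations:", lipinski_violations(aglycone))
```

With these invented inputs the glycoside breaks the weight, donor and acceptor limits while the aglycone breaks none, which is the direction of the difference described in the caption.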
Table 1. Summary of results of classification of different machine learning methods.
Method | Training Accuracy (%) | Test Accuracy (%) | Y-Randomization (**) | Precision (%) | Sensitivity (%) | Specificity (%) | ROC AUC (*) | TN | FN | FP | TP
Deep Neural Network (DNN) | 96.9 | 86.2 | 57.6 ± 1.8 | 89.8 | 86.0 | 86.5 | 97.6 ± 4.2 | 50 | 3 | 0 | 47
Logistic Regression Classification (LRC) | 90.2 | 90.2 | 57.7 ± 1.5 | 87.9 | 84.5 | 84.1 | 90.2 ± 4.2 | 43 | 10 | 7 | 40
Random Forest Classification (RFC) | 99.7 | 89.0 | 57.8 ± 1.8 | 93.3 | 87.2 | 91.4 | 95.0 ± 2.5 | 49 | 8 | 1 | 42
Dataset (number): activators (1120), controls (815; 1122 after SMOTE oversampling). (*) ROC AUC = area under the receiver operating characteristic curve. (**) N = 50 permutations. TN = number of correctly classified negative controls, FN = number of falsely classified positive controls, FP = number of falsely classified negative controls, and TP = number of correctly classified positive controls.
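
The abstract reports misclassification rates of 3% to 17% across the models. As a worked check, the short sketch below recomputes that figure from the TN/FN/FP/TP counts in Table 1. It assumes the four count columns read as DNN 50/3/0/47, LRC 43/10/7/40 and RFC 49/8/1/42, which is consistent with 50 positive and 50 negative controls per model. The function name `summarize` is hypothetical. The percentage columns of the table are reported separately by the authors and are not re-derived here.

```python
# Counts as read from Table 1 (assumption: TN, FN, FP, TP in that order).
counts = {
    "DNN": {"tn": 50, "fn": 3, "fp": 0, "tp": 47},
    "LRC": {"tn": 43, "fn": 10, "fp": 7, "tp": 40},
    "RFC": {"tn": 49, "fn": 8, "fp": 1, "tp": 42},
}


def summarize(tn, fn, fp, tp):
    """Return the control-set size and the share of misclassified controls."""
    total = tn + fn + fp + tp
    return {
        "n": total,
        "positives": tp + fn,   # positive controls (true activators)
        "negatives": tn + fp,   # negative controls
        "misclassified_pct": 100 * (fn + fp) / total,
    }


for model, c in counts.items():
    s = summarize(**c)
    print(f"{model}: n={s['n']}, misclassified={s['misclassified_pct']:.0f}%")
```

Under this reading each model is evaluated on 100 controls, and the misclassification rates come out at 3% (DNN), 17% (LRC) and 9% (RFC), matching the range quoted in the abstract.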
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Drewe, J.; Schöning, V.; Danton, O.; Schenk, A.; Boonen, G. Machine Learning-Based Analysis Reveals Triterpene Saponins and Their Aglycones in Cimicifuga racemosa as Critical Mediators of AMPK Activation. Pharmaceutics 2024, 16, 511. https://doi.org/10.3390/pharmaceutics16040511
