**Computer-Driven Development of an in Silico Tool for Finding Selective Histone Deacetylase 1 Inhibitors**

**Hajar Sirous 1,\* , Giuseppe Campiani <sup>2</sup> , Simone Brogi 3,\* , Vincenzo Calderone <sup>3</sup> and Giulia Chemi 2,**†


Academic Editors: Marco Tutone and Anna Maria Almerico Received: 5 April 2020; Accepted: 20 April 2020; Published: 22 April 2020

**Abstract:** Histone deacetylases (HDACs) are a class of epigenetic modulators overexpressed in numerous types of cancers. Consequently, HDAC inhibitors (HDACIs) have emerged as promising antineoplastic agents. Unfortunately, the most developed HDACIs suffer from poor selectivity towards a specific isoform, limiting their clinical applicability. Among the isoforms, HDAC1 represents a crucial target for designing selective HDACIs, being aberrantly expressed in several malignancies. Accordingly, the development of a predictive in silico tool employing a large set of HDACIs (aminophenylbenzamide derivatives) is herein presented for the first time. Software Phase was used to derive a 3D-QSAR model, employing as alignment rule a common-features pharmacophore built on 20 highly active/selective HDAC1 inhibitors. The 3D-QSAR model was generated using 370 benzamide-based HDACIs, which yielded an excellent correlation coefficient value (R<sup>2</sup> = 0.958) and a satisfactory predictive power (Q<sup>2</sup> = 0.822; Q<sup>2</sup> F3 = 0.894). The model was validated (r<sup>2</sup> ext\_ts = 0.794) using an external test set (113 compounds not used for generating the model), and by employing a decoys set and the receiver-operating characteristic (ROC) curve analysis, evaluating the Güner–Henry score (GH) and the enrichment factor (EF). The results confirmed a satisfactory predictive power of the 3D-QSAR model. This latter represents a useful filtering tool for screening large chemical databases, finding novel derivatives with improved HDAC1 inhibitory activity.

**Keywords:** 3D-QSAR; pharmacophore modeling; ligand-based model; HDACs; isoform-selective histone deacetylase inhibitors; aminophenylbenzamide

#### **1. Introduction**

Epigenetic defects in gene expression are well known in the onset and progression of cancer. In this context, pharmacological targeting of proteins of the cellular epigenetic machinery has provided opportunities for anti-cancer drug design [1,2]. Among epigenetic enzymes, histone deacetylases (HDACs) hold a fundamental role in regulating gene expression through histone post-translational modifications [3–5].

HDACs catalyze the removal of acetyl groups from the acetylated ε-amino termini of lysine residues located at the tails of the nucleosomal histones core. Histone deacetylation process leads to condensed chromatin structure which concomitantly restricts the accessibility of related transcriptional factors to their target genes, thereby suppressing gene expression including tumor suppressor genes [6–8]. The abnormal regulation of this process culminates with the high expression level of HDACs. This event has been observed in the development of several human cancers. Consequently, effective inhibition of HDACs has recently gained importance as a valid therapeutic strategy to reverse aberrant epigenetic changes associated with cancer [9–11]. HDAC inhibitors (HDACIs) induce histone hyperacetylation and subsequent transcriptional re-activation of suppressed genes which are correlated with a variety of effects on tumor cells including apoptosis, differentiation, cell cycle arrest, inhibition of proliferation and cytostasis [12,13].

ε

The HDAC family comprises 18 isoforms in mammalian cells which are categorized into four main classes (class I-IV) based on their structural and functional characteristics. HDACs belonging to class I (HDAC1–3 and 8), II (HDACs 4–7, 9 and 10) and IV (HDAC11) are all zinc-dependent metalloenzymes, while class III HDACs, also known as the sirtuins (SIRT1-7), requires NAD<sup>+</sup> as a cofactor for their catalytic function [6,14].

Extensive efforts over recent decades have led to the identification of four chemically diverse classes of HDACIs as potent antineoplastic agents including, hydroxamates, benzamides, cyclic peptides, and short-chain fatty acids [3]. The main breakthrough in developing these inhibitors was achieved by the US FDA approval of Vorinostat [15], Belinostat [16], Panobinostat (hydroxamate-based inhibitors) [17], Romidepsin (a cyclic peptide) [18] and Chidamide (a benzamide-based inhibitor) [19] for the treatment of lymphoma and myeloma. Moreover, several HDACIs such as Mocetinostat [20], Entinostat [21], Tacedinaline [22], Givinostat [23], and Abexinostat [24] are currently in clinical trials for treating various types of cancers. The structures of several approved and clinical HDACIs are shown in Figure 1.

**Figure 1.** HDACIs approved by FDA and/or in clinical trials.

Despite these successes, the most known HDACIs target multiple HDAC isoforms and this poor selectivity represents a major drawback which limits their broad clinical utility [25,26]. Isoform-selective HDACIs would offer superior therapeutic advantages due to limited off-target and undesirable effects, improved clinical efficacy and better tolerability [27,28]. Moreover, isoform-selective inhibitors would provide chemical tools to delineate the precise roles of individual HDAC isoforms in human diseases, including rare disorders [29]. Therefore, in recent years, identification of highly potent inhibitors with strict selectivity towards a specific isoform have caught more attention in the development of novel HDACIs for the epigenetic therapy [30–33]. In this context, due to the pivotal role of HDAC1 in the angiogenesis, proliferation, and survival of mammalian carcinoma cells, this isoform is particularly being sought as a preferred target for successful design of selective HDACIs [34–38].

A wide range of hydroxamic acid derivatives are known to be potent pan-inhibitors of several HDAC isoforms [39,40]. Accordingly, in recent years, there has been considerable interest in developing non-hydroxamate HDACIs with satisfactory selectivity towards a specific isoform. In this context, aminophenylbenzamide derivatives represent an important class of non-hydroxamate HDACIs owing to their potent HDAC inhibition along with desirable pharmacokinetic profile and excellent selectivity for HDAC class I enzyme. Furthermore, inhibitors containing an ortho-aminobenzamide typically exhibit relatively greater levels of selectivity for class I HDACs, particularly HDAC1 [35,41]. In addition, hydroxamic-derived inhibitors often suffer from some serious pharmacokinetic issues including poor metabolic stability, rapid clearance, undesirable oral absorption, and short half-life in plasma, whereas benzamides-based inhibitors show better metabolic stability and oral bioavailability [42–45]. For these solid reasons, renewed efforts are being directed towards the further exploration of innovative aminophenylbenzamide chemotypes as privileged and valuable scaffolds to develop isoform-selective HDACIs [46–48].

Recently, in silico techniques, including ligand-based methods such as pharmacophore modeling and three-dimensional quantitative structural activity relationship (3D-QSAR), have efficiently contributed to guide the discovery of novel bioactive molecules, with reduced costs in terms of money and time [49–51]. In fact, QSAR methods provide relationships between physicochemical properties of a series of compounds and their biological activities to obtain a reliable statistical model for predicting the activities of new chemical entities. The fundamental principle of the technique is that the change in structural properties determines modifications in biological activities of the compounds. In the classical QSAR approaches, affinities of ligands to their binding sites, inhibition constants, rate constants, and other biological data have been correlated with molecular properties including lipophilicity, polarizability, electronic and steric properties (Hansch analysis) or with structural features (Free-Wilson analysis). However, classical QSAR approach has only a limited utility for designing new molecules due to the lack of consideration of the 3D structure of the selected compounds. Accordingly, 3D-QSAR has emerged as a natural extension to the classical Hansch and Free-Wilson approaches, which exploits the three-dimensional properties of the ligands to predict their biological activities employing robust chemometric techniques such as partial least squares (PLS). The success of these methods can be attributed to several factors including identification of important features for the activity, rationalization of activity trends in molecules under study, prediction of the specific activity for a selected target or undesirable effects of new compounds. On the other hand, ligand-based methods have been used in virtual screening campaigns of chemical databases to find novel hits with improved potency and can be combined with other computational and experimental workflows to discover new potential drugs [52–57].

Until now, only limited number of 3D-QSAR studies for hydroxamate set of HDACIs have been reported [58–64]. However, to the best of our knowledge, no previous attempt has been made to seek the structural and chemical features of aminophenylbenzamide governing their HDAC inhibitory activities employing 3D-QSAR methodology along with a large set of compounds. Given the aforementioned therapeutic significance of this class of inhibitors, we developed and validated a 3D-QSAR model using a comprehensive set of previously reported benzamide derivatives as selective HDAC1 inhibitors. From this perspective, the software Phase, implemented in Maestro, was employed to explore a common-features pharmacophore hypothesis based on highly active ligands. This hypothesis was

then used as an alignment rule to derive a predictive 3D-QSAR model [65]. Such an in silico tool could aid not only in forecasting the HDAC1 inhibitory activity of newly designed chemical entities, but also offer a robust foundation for designing new selective HDACIs with increased binding affinities to HDAC1. Accordingly, the developed model could have a relevant implication in drug discovery campaign for searching isoform-selective HDACIs.

#### **2. Results and Discussion**

The application of 3D-QSAR methodology in the design of HDACIs has received little attention to date and only a few instances of field-based QSAR models (comparative molecular field analysis (CoMFA) and comparative similarity indices analysis (CoMSIA) methods) have been reported for hydroxamate-based inhibitors [58–64]. However, this approach has not been carried out for a large set of benzamide-based derivatives behaving as HDACIs. On the other hand, we have recently developed a series of predictive 3D-QSAR models for different purposes including the identification or rational design of new chemical entities for different targets [53,55], and the prediction of undesirable effects of novel molecules such as potential *h*ERG K<sup>+</sup> channel related cardiotoxicity [57]. In all these cases, Phase was used to develop a computational tool using a pharmacophore-based alignment that links the information of pivotal functional groups of the ligands with their biological activity [65]. The fruitful results of the above-mentioned molecular modeling studies as well as therapeutic significance of benzamide chemotypes as valuable isoform-selective HDACIs inspired us to derive a pharmacophore-based 3D-QSAR model to be used as a screening filtering tool able to quantitatively predict the HDAC1 inhibitory activity of newly designed ligands.

#### *2.1. Data Set Preparation*

A comprehensive data set of 370 diverse HDACIs based on benzamide scaffold with functional biological activity expressed as IC<sup>50</sup> (see the Supplementary Materials for further details) with a range of HDAC1 inhibitory activities spanning five orders of magnitude (from 6.0 nM of compound **17** to 50 µM of compound **132**, Table S1) were selected from literature for developing a predictive 3D-QSAR model. Subsequently, an extensive conformational search for each ligand was performed employing MacroModel software (see experimental section for further details). Conformational analysis is crucial to enhance both the quality of the alignment for the molecules used to generate the 3D-QSAR model and the reliability of the in silico tool [53–57]. After the exhaustive conformational analysis of the selected ligands (Table S1), the generation of the 3D-QSAR model was started.

#### *2.2. Pharmacophore Modeling and 3D-QSAR Model Generation*

As a first step to develop the 3D-QSAR model, 20 most active compounds with IC<sup>50</sup> values ≤ 10 nM included in the data set (ligands **1**–**20**, Figure 2, Table S1) were considered to find out common pharmacophore hypotheses that were subsequently scored and ranked by the software Phase. This means that the highly active compounds possess common features that are responsible for the activity exploited by a 3D pharmacophore hypothesis. Therefore, a pharmacophore hypothesis provides a rational picture of primary chemical features of ligands responsible for HDAC1 inhibitory activity and therefore can be used as a reliable alignment rule for the 3D-QSAR model development.

**Figure 2.** Chemical structure of highly active compounds against HDAC1 (IC<sup>50</sup> comprised between 0.004 µM and 0.01 µM) used for generating a common-features pharmacophore.

To have optimal combination of sites or features shared by the most active ligands, the minimum and maximum number of site points were set on 5. This means that we selected 5 as maximum features to include in the pharmacophore models. Among the 26 common pharmacophore hypotheses generated by the software Phase, only those models which showed superior alignment with the active compounds were identified by calculating the survival score. The survival scoring function of Phase module identifies the best candidate hypothesis from the generated models and offers an overall ranking of all the hypotheses. The scoring algorithm includes contributions from the alignment of site points and vectors, volume overlap, selectivity, number of ligands matched, relative conformational energy, and activity. To identify pharmacophore models with more active and less inactive features, all models were mapped to inactive compounds and scored. If inactives score well, the hypothesis could be invalid because it does not discriminate between actives and inactives. Therefore, adjusted survival score was calculated by subtracting the inactive score from the survival score of these pharmacophores.

After the scoring, the model ADDRR, herein referred to ADDRR hypothesis, with the maximum adjusted survival score (3.769) and lowest relative conformational energy, was selected as the top-ranked hypothesis among the generated 3D model hypotheses. The different scoring parameters for the selected hypothesis (ADDRR) were provided in Table 1. The 3D spatial arrangement of all features with inter-feature distance constraints of ADDRR are presented in Figure 3. As shown in this figure, the hypothesis was characterized by the five main features: one hydrogen-bond acceptor (A), two hydrogen-bond donors (D), and two aromatic rings (R).

**Table 1.** The different scoring parameters for the best pharmacophore hypothesis, matching all 20 highly active compounds used.

**Figure 3.** (**A**) Superposition of highly active compound **13** and ADDRR hypothesis. (**B**) ADDRR hypothesis and its inter-feature distances. Features are as follows: H-bond acceptor = A, red vector; H-bond donor = D, blue vectors; aromatic feature = R, orange ring (pictures were generated by means of Maestro software).

Figure 3A depicts one of the most active ligands in the set (compound **13**, Table S1), mapped onto the ADDRR pharmacophore. As depicted in the mentioned figure, compound **13** thoroughly fits all features of the pharmacophore model, underlining the previous findings on structural components required for interacting with the HDAC1 binding site [66–68]. As illustrated in Figure 3, the carbonyl oxygen of the benzamide group served as a hydrogen-bond acceptor (HBA) feature, while two hydrogen-bond donor (HBD) features were mapped to the protons of the 2-aminophenyl NH<sup>2</sup> and amide NH. Furthermore, out of two aromatic features, one was mapped to the phenyl ring of the 2-aminophenyl. The other aromatic feature was mapped to the pyridine ring of the nicotinamide moiety. This hypothesis was well corroborated by accepted common pharmacophore model for HDACIs comprising the zinc binding group (ZBG), a linker and a cap group as established by computational and biophysical studies reported for HDAC1 inhibitors [66–70]. Based on the current study, the aniline groups of the benzamide-based inhibitors are served as ZBG, coordinating the catalytic zinc ion in the HDAC1 active site. Moreover, it is well known that H-bonds formation with key residues of HDAC1 active site are commonly found for the ZBG of this class of HDAC1 inhibitors. In particular, in addition to the zinc ion coordination, protons of the 2-aminophenyl NH<sup>2</sup> group could also establish hydrogen bonds with His140 and His141, while the carbonyl oxygen of the benzamide portion could also form another hydrogen binding interaction with hydroxyl group of

Tyr303. It has been reported that NH of amide could offer the appropriate HBD vector to address the Gly149 through H-bond formation. The presence of two aromatic features capable of participating in π-π stacking interactions with hydrophobic residues Phe150, Tyr204, Phe205, and Tyr303, represents another important requisite for further stabilization of ligand binding. These residues were located in a long and narrow hydrophobic tube-like channel and thus interaction with them allow tubular access of ligand into active site [66,67,69,70]. Accordingly, the above-mentioned ADDRR hypothesis imparts the key features of ligand for providing the relevant interactions with the HDAC1 active site.

The ADDRR pharmacophore hypothesis was then employed as alignment rule to derive the 3D-QSAR model. In this step, the compounds were randomly divided into training (70%) and test sets (30%) taking into account that the response range was well-covered in both sets (Table S1). This choice was made to warrant the inclusion of the positive information originating from 70% of the compounds enclosed in the training set (corresponding to 259 compounds), for the development of the computational tool. Moreover, the compounds kept in the test set (30%, 111 compounds) ensures an appropriate assessment of the predictive power of the generated model through an exhaustive internal validation. The atom-based version of Phase's 3D-QSAR workflow was preferred to the pharmacophore-based one. Such a choice allowed us to take into account contributions associated with all the important structural features other than pharmacophore for HDAC1 inhibitory activity such as the steric clashes. To enhance the model accuracy and evade overfitting phenomenon, models containing one up to seven factors were generated for the studied data set. Statistical parameters for each model are provided in Table 2. Model featuring seven factors was preferred and selected because it better performed in comparison with other models. The reliability of the selected model is justified by the fact that all statistical parameters were in acceptable range. In this regard, the correlation and cross-validated correlation coefficients (R<sup>2</sup> = 0.958 and Q<sup>2</sup> = 0.822, respectively) of the selected model along with the Pearson R-value (R-Pearson = 0.915) were extremely satisfactory, indicating a close correspondence between estimated and experimental IC<sup>50</sup> values. Moreover, the high Fisher ratio (F = 822.1) suggested a statistically significant regression model, which was further supported by the small value of the variance ratio (P = 4.377 × 10−169), an indication of a high degree of confidence. Finally, the small values of the standard deviation and the root-mean-square error (SD: 0.178 and RMSE: 0.281, respectively) also provided indication about the robustness of the developed computational model. Moreover, the Q<sup>2</sup> F3 value clearly indicates that the 3D-QSAR model with seven factors is robust.

**Table 2.** 3D-QSAR statistical parameters of the seven Latent Variables (LVs) Phase-derived sets of models.


aR 2 : value of r<sup>2</sup> of the regression. <sup>b</sup>SD: standard deviation of the regression. <sup>c</sup>F: variance ratio. <sup>d</sup>P: significance level of variance ratio. <sup>e</sup>RMSE: root-mean-square error in the test set predictions. <sup>f</sup>Q<sup>2</sup> : value of Q<sup>2</sup> for the predicted activities. <sup>g</sup>Q<sup>2</sup> F3: value of Q<sup>2</sup> F3 : for the predicted activities calculated as reported in Materials and Methods section. <sup>h</sup>R-Pearson: correlation between the predicted and observed selectivity index values for the test set.

A scatter plot of experimental versus predicted activities was generated to assess the results (Figure 4). Based on this plot, the IC<sup>50</sup> values were reliably predicted for both training and test set molecules (Table S1). This plot along with the aforementioned statistical features clearly imply the significance of the approach and indicate a QSAR model with a robust predictive power.

**Figure 4.** Scatter plot for the predicted (Phase-predicted activity) and the observed (experimental activity) pIC<sup>50</sup> values (µM) as calculated by the 3D-QSAR model applied to the training set (blue) and test set (cyan) compounds.

− Three-dimensional aspects obtained from the QSAR model were visualized using 3D plots of the crucial volume elements occupied by ligands. Such plots allow the visual analysis of important features of ligand structures along with their contributions to the biological activity. The 3D plot representation of the whole model, superimposed to the highly (**3, 10**, and **13**), moderate (**22, 213,** and **239**), and less active derivatives (**218, 279**, and **299**), is depicted in Figure 5. In this illustration, the blue and red cubes indicate the positive and negative coefficients, respectively. In fact, blue cubes refer to ligand regions in which the specific feature is important for better activity, whereas the red cubes are indicative of a particular structural feature or functional group which is not essential for the activity or is likely to decrease the activity. Cubes with small positive and negative coefficients, which therefore did not greatly affect the activity, were filtered out by setting a 1.50 × 10−<sup>2</sup> coefficient threshold. Remarkably, compounds **10** and **13** (Figure 5A,B, respectively) as well as other highly active ligands, mainly lodge in the blue regions, while the less active compounds such as **218** and **299** (Figure 5G,I, respectively) largely resides on the red regions. Moreover, regarding some compounds with moderate activity and generally all compounds with limited activity, we also observed a significant inability to match all the pharmacophore features, according to the decrease of inhibitory potencies.

**Figure 5.** (**A**–**C**) Superposition of highly active compounds **3**, **10**, and **13**, respectively with the 3D-QSAR model. (**D**–**F**) Superposition of moderate active compounds **22** (Mocetinostat), **213**, and **239**, respectively with the 3D-QSAR model. (**G**–**I**) Superposition of less active compounds **218**, **279**, and **299**, respectively with the 3D-QSAR model. The picture was generated by means of Maestro software (Schrödinger, LLC, New York, NY, USA, 2015).

#### *2.3. In Silico 3D-QSAR Model Validation*

#### 2.3.1. Validation Using External Test Set

After the generation of the 3D-QSAR model, a preliminary in silico validation was performed using an external test set selected from the literature that have not been used for generating the computational model. This set was composed of 113 compounds with different inhibitory activities against HDAC1 (ranging from 5.8 nM to 1140 nM; Table S2 in the Supplementary Materials). As reported in Table S2, our model was satisfactorily efficient in estimating the HDAC1 inhibitory activity of compounds included in the external test set. In the scatter plot depicted in Figure 6, the experimental and predicted pIC<sup>50</sup> values of these compounds are also displayed, offering a reasonable correlation coefficient (r2 ext\_ts = 0.794). This result provided further confirmation that the correlation shown by the model is not accidental.

**Figure 6.** Scatter plot for the predicted (Phase-predicted activity) and observed (experimental activity) pIC<sup>50</sup> values (µM) as calculated by the 3D-QSAR model with 7 factors applied to the external test set.

#### 2.3.2. Validation Using Decoy Set and Receiver-Operating Characteristic (ROC) Curve Approach

For a further validation and to assess the performance of the developed 3D-QSAR model, we employed a validation method based on generation of decoys set. This procedure is usually employed to evaluate the capability of in silico tools such as 3D-QSAR models to discriminate between active or inactive molecules [71–75]. Starting from highly active compounds (ligands 1–20 in Table S1), 86 additional compounds with good activity against HDAC1 (cutoff IC<sup>50</sup> < 35 nM; Tables S1 and S2) were selected from the training, test and external validation sets for a total of 106 compounds (Table S3) from which decoys were generated. For this set of active ligands, DUD-E server generated 5764 decoys. After an appropriate minimization and conformational search of decoys, we have combined them with the active molecules (referred as A in Figure 7A) for a total of 5870 compounds (referred as D in Figure 7A) that were then subjected to a virtual screening using the developed 3D-QSAR model. Interestingly, the results of this evaluation supported the validity of the proposed model. Analysis of the database screening results (Figure 7A) indicated a trend in which inactive compounds fail to completely satisfy all the pharmacophore features, thus making their predicted activity very poor or absent. In contrast, the 3D-QSAR model was reasonably efficient in the estimation of HDAC1 inhibitory activity of active compounds.


**Figure 7.** (**A**) EF and GH scores obtained by the application of 3D-QSAR model in a database screening; and (**B**) receiver-operating characteristic (ROC) curve generated from database screening.

According to the screening results (Figure 7A), the top 30 ranked compounds were considered to be hits (Ht). This cutoff value could represent a suitable number of molecules (about 1% of database) to be purchased after a virtual screening campaign. Remarkably, among Ht, 26 (Ha) compounds belonged to the set of 106 known HDAC1 inhibitors. Furthermore, this qualitative analysis was well supported by the calculation of some statistical parameters such as EF and GH score (see Materials and Methods section for calculation details). In this regard, the calculated EF was 48.33, which implies that it could be about 48.33 times more probable to select active compounds from the hit list compared with random selection from the complete database. The estimated GH score value of 0.71, larger than 0.5, indicates a great reliability of the model (Figure 7A). This suggests that the developed computational model can serve as efficient tool in virtual screening studies to find out novel chemical entities behaving as selective HDAC1 inhibitors.

The applicability of the proposed 3D-QSAR model was further evaluated by means of the receiver-operating characteristic (ROC) curve. The ROC curve approach is a well-recognized metric used as an objective way to assess the balance between model sensitivity (capability to discover true positives) and specificity (capability to avoid false positives) [55,57,76,77]. For this purpose, 5870 compounds employed in the previous validation step, were ranked according to their predicted activity values as estimated by the 3D-QSAR model. The output of the ROC curve provided a score for appraising the overall performance of the model. In particular, the closer the ROC score is to 1.0, the

better is the model at discriminating active from inactive compounds. ROC curve analysis of our in silico model yielded a satisfactory Area Under the Curve (AUC) score of 0.94 (Figure 7B), providing additional evidence about the predictivity of the developed 3D-QSAR model.

#### **3. Materials and Methods**

#### *3.1. Hardware and Software Specifications*

All computational tasks in this study were carried out using molecular modeling package from Schrödinger suite 2015 (Schrödinger, Inc., LLC, New York, NY, USA) installed on an Intel(R) Xeon(R) CPU E5-2620 v2 @ 3.30 GHz, 64 GB RAM with 12 processors, and a 2GB graphics card of NVIDIA Quadro K2200 running Ubuntu 10.04 LTS (long-term support) as operating system. Access to the Schrödinger modules as well as the capability to organize and analyze data was provided by Maestro as a portal interface of Schrödinger [78].

#### *3.2. Ligands and Data Set Preparation*

A comprehensive set of HDAC1 inhibitors characterized by the 2-aminophenylbenzamide scaffold with known IC<sup>50</sup> values that vary over a wide range was collected from the literature [66,67,79–92] and the bindingDB database [93]. The selection criterion for the compounds to be included in the set was that their HDAC1 inhibition was evaluated using the same fluorescent assay based on the fluorogenic substrate Fluor-de-Lys. This inclusion criterion allowed us to obtain a homogeneous set of compounds regarding their biological evaluation. This step is crucial to develop a predictive model since the data selection is pivotal for adding the correct information to a software for developing computational models. The 3D structures of all ligands were built using the builder panel in the Maestro. For the molecules possessing known stereochemistry, the absolute configuration was specified during the drawing of the compounds. All structures were treated by LigPrep module of Schrödinger suite 2015 [94] in order to generate the most probable ionization state at the cellular pH value (7.4 ± 0.2) as reported by us [95–97]. Moreover, the OPLS-AA\_2005 force field was used for optimization, which produces the lowest energy conformer of the ligand [98]. The prepared ligands were then submitted to MacroModel software [99] in order to obtain an exhaustive conformational analysis using the OPLS-AA\_2005 as force field. The solvent effects are simulated employing the analytical Generalized-Born/Surface-Area (GB/SA) model [100], and no cutoff for non-bonded interactions was selected. Molecular energy minimizations were performed using Polak–Ribiere conjugate gradient (PRCG) method with 2000 maximum iterations and 0.001 gradient convergence threshold. The conformational searches were carried out by employing MCMM (Monte Carlo Multiple Minimum) torsional sampling method. Automatic setup with 21 kJ/mol (5.02 kcal/mol) in the energy window for saving structure and a 0.5 Å cutoff distance for redundant conformers was used.

#### *3.3. 3D-QSAR Model Generation*

The software package Phase 4.2 [101], implemented in Maestro suite, was used to generate pharmacophore hypotheses and 3D-QSAR models for HDAC1 inhibitors based on 2-aminophenylbenzamide scaffold. Given a set of molecules with high affinity for a particular protein target, this software uses fine-grained conformational sampling and a range of scoring techniques to identify a common-features pharmacophore hypothesis, which conveys 3D structural characteristics that are critical for the activity. Pharmacophore feature sites for the molecules were specified by a set of features well-defined in Phase as hydrogen-bond acceptor (A), hydrogen-bond donor (D), hydrophobic group (H), negatively charged group (N), positively charged group (P) and aromatic ring (R). No user-defined feature was employed for the present study. The ligands prepared as reported in the previous step, were imported into the "develop common pharmacophore hypotheses" panel of Phase with their respective biological activity values. Twenty active compounds (Figure 2 and in Table S1 in the Supplementary Materials) possessing highly inhibitory potency against HDAC1 were

selected for generating the pharmacophore hypotheses. Common-features pharmacophore hypotheses were identified, scored, and ranked by means of conformational analysis and tree-based partitioning techniques. In the score hypotheses step, common pharmacophores are examined, and a scoring procedure is applied to identify the pharmacophore from each surviving n-dimensional box that yields the best alignment of the active set ligands. This pharmacophore provides the means to explain how the active molecules bind to the receptor.

The best ranked pharmacophore model obtained by Phase (ADDRR, shown in Figure 3A superimposed to aminophenylnicotinamide analogue **13**), consisted of five features: one hydrogen-bond acceptor (A), two hydrogen-bond donors (D), and two aromatic functions (R). The inter-feature distances (Figure 3B) were measured by using the site measurements tool implemented in the software Phase. This pharmacophore was used as alignment rule for further 3D-QSAR analysis. All the molecules used for the QSAR studies (Table S1) were aligned to the selected pharmacophore hypothesis. In the present study, we set a pIC<sup>50</sup> threshold for the selection of active and inactive ligands. These pIC<sup>50</sup> values were also used as the dependent variable in the 3D-QSAR calculations. In particular, compounds that showed an IC<sup>50</sup> comprised between 5 and 50 µM were considered to be inactive ligands. Moderate inhibitors were considered compounds with IC<sup>50</sup> between 10 nM and 5 µM, while compounds possessing an IC<sup>50</sup> ≤ 10 nM were assigned as potent inhibitors of HDAC1 and consequently as actives during 3D-QSAR model generation. Remarkably, to avoid possible faults arising from the inclusion in the set of molecules with uncertain activity, only molecules with experimentally definite inhibitory activity have been selected to develop the in silico model. Atom-based QSAR models were developed for ADDRR hypothesis using 259 compounds in the training set (370 compounds were randomly divided 70% in the training and 30% in the test set) and a grid spacing of 0.5 Å. QSAR models were generated by means of PLS method. An internal validation was achieved employing leave-n-out (LnO) technique as specified in Phase user manual (Phase, version 4.2, User Manual, Schrödinger press, LLC, New York, NY, 2015). As reported by Todeschini et al. the internal validation results generally expressed in terms of Q<sup>2</sup> metrics should be amended introducing Q<sup>2</sup> F3 metrics for the internal validation of the QSAR models [102]. For this purpose, we calculated these metrics (Table 2) employing the formula reported below (Equation (1)).

$$\left|Q\_{\rm F3}^2\right.\left.=\left.1-\frac{\sum\_{i=1}^{n\_{\rm OUT}} \left(y\_i-\mathfrak{H}\_{i\backslash i}\right)^2 / n\_{\rm OUT}}{\sum\_{i=1}^{n\_{\rm TR}} \left(y\_i-\overline{y}\_{\rm TR}\right)^2 / n\_{\rm TR}}\right.\tag{1}$$

where *y<sup>i</sup>* is the experimental response of the ith object, *yˆi*/*<sup>i</sup>* is the predicted response when the ith object is not in the training set, *nTR* and *nOUT* are the number of training and prediction objects, respectively, and *y¯TR* is the average value of the training set experimental responses. Moreover, to avoid overfitting/underfitting phenomena, we considered 7 factors that is an appropriate for the number of selected compounds. In fact, although there is no limit on the maximum number of factors, but as a general rule, we stopped adding factors when the standard deviation of regression is approximately equal to the experimental error (calculated as median error among the selected compounds).

#### *3.4. In Silico 3D-QSAR Model Validation*

After the generation of the 3D-QSAR model, a preliminary in silico validation was performed using a large external test set of compounds (113 molecules) selected from the literature [83,84,89,103–106] (Table S2 in the Supplementary Materials) that have not been used for generating and cross validating the model. These compounds were prepared by using Maestro, LigPrep, and MacroModel, adopting the same procedure for preparing the molecules used to derive the model. Moreover, to further assess that the chosen model with 7 factors better performs with respect to the other Phase-derived models, we applied the validation method employing the external test set to all the generated QSAR models (Table 2). This workflow established that the model with 7 factors is the best performing model of the series in predicting the activity of the external test set with a correlation coefficient r<sup>2</sup> ext\_ts = 0.794

(Figure 6) (LVs 1, r<sup>2</sup> ext\_ts = 0.421; LVs 2, r<sup>2</sup> ext\_ts = 0.698; LVs 3, r<sup>2</sup> ext\_ts = 0.657; LVs 4, r<sup>2</sup> ext\_ts = 0.712; LVs 5, r<sup>2</sup> ext\_ts = 0.735; LVs 6, r<sup>2</sup> ext\_ts = 0.787; Figures S1–S6, respectively).

Further validation of the model was done by enrichment study using decoy test [107]. For this purpose, the Enhanced (DUD-E) web server [108] was employed to generate a set of useful decoys generated from a collection of 106 active compounds from three sources: 1) active compounds used to develop the pharmacophore model, 2) other compounds with good activity against HDAC1 used in 3D-QSAR studies and 3) the most active compounds of the external test set. This collection consisted of 106 active compounds with IC<sup>50</sup> ≤ 35 nM (Table S3). For this set of active ligands, the DUD-E server provided 5764 inactive ligands (redundant structures in the output files were deleted) from a subset of the ZINC database filtered using the Lipinski's rules for drug-likeness, for a total of 5870 compounds (5764 inactives plus 106 actives). Each of these inactive decoys was selected to bear a resemblance to the physicochemical properties of the reference ligand but differ from it in terms of 2D structure (e.g., large difference of Tanimoto coefficient between decoys and active molecules). Although largely used, the approach based on decoys sets presents some limitations (i.e., the decoy sets often span a small, synthetically feasible subset of molecular space and are restricted in physicochemical similarity compared with actives). After the generation, the decoys sets were downloaded as 126 smiles files and imported into Maestro and submitted to LigPrep application to properly convert smiles into 3D structures as well as for removing potential erroneous structures. Subsequently, to perform a minimization and a conformational search of the obtained structures MacroModel program was employed (same parameters for ligands preparation were applied). A single file containing conformers of active molecules and decoys was created and submitted to Phase software for predicting the inhibitory activity of database against HDAC1 using the developed 3D-QSAR model and employing "search for matches" option. After decoys generation and activity evaluation, the Güner–Henry score, i.e., goodness of hit list (GH) and enrichment factor (EF) values were estimated by Equations (2) and (3), respectively.

$$\text{EF} = \frac{\text{Ha/Ht}}{(A/D)} \tag{2}$$

$$GH = \left\{ \frac{Ha \star (3A + Ht)}{4HtA} \right\} \star \left[ 1 - \frac{(Ht - Ha)}{(D - A)} \right] \tag{3}$$

where *Ht* represents the total number of compounds in the hit list found by virtual screening, *Ha* is the total actives found by virtual screening considering the top 30-ranked position (positions comprise within the cutoff value). The total number of compounds (*Ht*) might represent the number of molecules to purchase after a virtual screening protocol and almost the 1% of the considered database (*D*). *A* represents the total of the active derivatives enclosed in the database, and *D* stands for the total number of molecules existing in the set. The range of *GH* score varies from 0 to 1. The *GH* score 0 means a null model, while the *GH* score 1 denotes generation of an ideal model. Moreover, the % yield of actives (% *YA*) and % ratio of actives (% *RA*) were evaluated by Equations (4) and (5), respectively.

$$\%YA = \left[ \left( \frac{Ha}{Ht} \right) \ast 100 \right] \tag{4}$$

$$\%RA\ = \left[\left(\frac{Ha}{A}\right) \ast 100\right] \tag{5}$$

Moreover, to assess the predictive power of the 3D-QSAR model, a ROC was employed through an Enrichment Calculator (enrichment.py) script [55–57,76]. The mentioned script calculates the enrichment metrics, including area under the receiver-operating characteristic curve (AUC), from virtual screening by means of the output structure file and a list of known active molecules. The output of the screening protocol, using active molecules and decoys, consisted of a list of molecules ranked by the predicted activity from the top-predicted molecules as estimated by the 3D-QSAR model. These ranking data along with a list file of active molecules were submitted to the enrichment.py application.

#### **4. Conclusions**

The present study describes the generation of a ligand-based pharmacophore model (ADDRR) for a subset of 20 highly active aminophenylbenzamide derivatives reported as selective HDAC1 inhibitors by employing the software Phase implemented in the Schrödinger molecular modeling suite. With the aid of pharmacophore-based alignment rule, a meaningful 3D-QSAR model was derived and validated employing of the QSAR models a large set of benzamide-based HDAC1 inhibitors (training set, test set, and an external test set for a total of 483 molecules) by using PLS analysis. The main objective of this approach was to develop an in-house computational tool for the prediction of HDAC1 inhibitory activity during the design of innovative aminophenylbenzamide chemotypes as privileged therapeutic scaffold in the isoform-selective HDACIs research. The validation outcomes confirmed that the proposed 3D-QSAR model is endowed with satisfactory predictive power taking into account favorable structural requirements responsible for HDAC1 inhibitory activity. This aspect has been computationally investigated since the selectivity is implicit in the template molecules; however prospective validation is needed to exploit the performance of the model. In fact, the developed 3D-QSAR model can be used for rationally designing novel and selective HDACIs. Moreover, based on the computational investigation, the developed model possesses a rationale for virtual screening campaign, with huge potential in isoform-selective HDACIs drug discovery, and it can effectively provide a set of guidelines for the design and optimization of novel derivatives with greater activity towards HDAC1.

**Supplementary Materials:** The following are available online, Table S1: Experimental (Observed column) and predicted (Predicted column) activity pIC<sup>50</sup> for compounds used for developing the 3D-QSAR model., Table S2: Experimental (Observed column) and predicted (Predicted column) activity pIC<sup>50</sup> for compounds included in the external test set, Table S3: Active compounds used for generating a decoys set, Figures S1–S6: Scatter plots for all the model generated by Phase.

**Author Contributions:** Conceptualization, H.S. and S.B.; methodology, H.S., G.C. (Giulia Chemi) and S.B.; software, H.S., G.C. (Giulia Chemi) and S.B.; validation, H.S., G.C. (Giulia Chemi) and S.B.; formal analysis, H.S., G.C. (Giulia Chemi), G.C. (Giuseppe Campiani), V.C. and S.B.; investigation, H.S., G.C. (Giulia Chemi); data curation, H.S., G.C. (Giulia Chemi), G.C. (Giuseppe Campiani), V.C. and S.B.; writing—original draft preparation, H.S. and S.B.; writing—review and editing, H.S., G.C. (Giulia Chemi), S.B., G.C. (Giuseppe Campiani), V.C.; supervision, S.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Bioinformatics Research Center in Isfahan University of Medical Sciences (Iran) grant number 298137 to H.S.

**Acknowledgments:** The authors thank Lorenzo Chemi for the technical assistance during the calculation of Q<sup>2</sup> F3.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **The Study on the hERG Blocker Prediction Using Chemical Fingerprint Analysis**

#### **Kwang-Eun Choi, Anand Balupuri and Nam Sook Kang \***

Graduate School of New Drug Discovery and Development, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Korea; hwendiv@naver.com (K.-E.C.); balupuri@cnu.ac.kr (A.B.)

**\*** Correspondence: nskang@cnu.ac.kr; Tel.: +82-42-821-8626

Academic Editors: Marco Tutone and Anna Maria Almerico Received: 6 May 2020; Accepted: 2 June 2020; Published: 4 June 2020

**Abstract:** Human ether-a-go-go-related gene (hERG) potassium channel blockage by small molecules may cause severe cardiac side effects. Thus, it is crucial to screen compounds for activity on the hERG channels early in the drug discovery process. In this study, we collected 5299 hERG inhibitors with diverse chemical structures from a number of sources. Based on this dataset, we evaluated different machine learning (ML) and deep learning (DL) algorithms using various integer and binary type fingerprints. A training set of 3991 compounds was used to develop quantitative structure–activity relationship (QSAR) models. The performance of the developed models was evaluated using a test set of 998 compounds. Models were further validated using external set 1 (263 compounds) and external set 2 (47 compounds). Overall, models with integer type fingerprints showed better performance than models with no fingerprints, converted binary type fingerprints or original binary type fingerprints. Comparison of ML and DL algorithms revealed that integer type fingerprints are suitable for ML, whereas binary type fingerprints are suitable for DL. The outcomes of this study indicate that the rational selection of fingerprints is important for hERG blocker prediction.

**Keywords:** hERG toxicity; drug discovery; fingerprints; machine learning; deep learning

#### **1. Introduction**

The human ether-a-go-go related gene (hERG) or KCNH2 gene encodes a voltage-gated potassium channel known as the hERG channel. This channel plays a key role in cardiac action potential repolarization. Reduced function of hERG causes potential action prolongation and increases the risk for potentially fatal ventricular arrhythmia, torsades de pointes. Therefore, preclinical hERG testing is essential in the drug discovery process to avoid cardiac toxicity [1]. Recently, many drugs such as astemizole, terfenadine, cisapride, thioridazine, grepafloxacin and sertindole were withdrawn from the market due to undesired cardiotoxicity effects [2,3]. Nowadays, the US Food and Drug Administration (FDA) demands in vitro hERG assay of lead compounds prior to clinical trials [4]. The side effects of unexpected hERG channel binding by drug candidates are a major challenge in the drug discovery process [5]. The development of an accurate prediction model for hERG channel blockers is crucial in the early stages of drug discovery and development.

Although the electron microscopy structure of the membrane protein hERG is known [6], its X-ray crystal structure is not available. Thus, structure-based hERG blocker prediction is challenging. However, a few structure-based hERG blocker predictions were attempted with homology modeling using structures of the related potassium ion channels as templates [7]. Several researchers have applied a ligand-based drug design approach to identify hERG blockers. They used various machine learning (ML) algorithms such as naïve Bayes (NB) [8], support vector machine (SVM) [9], random forest (RF) [10] and k-NN [11]. Sun et al. reported an NB model with a receiver operating characteristic (ROC) value of 0.87 on the basis of 1979 compounds [12]. Jia et al. applied SVM methods and

atom-type descriptors without fingerprints on 1043 compounds to develop a model with accuracy of 0.94 [9]. Yap et al. developed a model with accuracy of 0.97 based on 310 compounds using similar methodology [13]. Marchese Robinson et al. built an RF model with Matthews correlation coefficient (MCC) of 0.83 using a dataset of 368 compounds. They used the extended-connectivity fingerprint (ECFP\_4) for developing the model [14]. Kim et al. developed a model with accuracy of 0.96 using the RF algorithm on a dataset of 293 compounds [15]. Chavan et al. reported a k-NN model with accuracy of 0.55 on the basis of 1967 compounds [11]. The deep learning (DL) approach with artificial neural networks (ANN) has also been used to predict hERG blockers [16,17]. Cai et al. applied DL to 7889 compounds and obtained accuracy of 0.93 and an area under the receiver operating characteristic curve (AUC) value of 0.97 for the best model [17]. Zhang et al. applied DL to 1871 compounds and developed a model with accuracy of 0.78 [18]. Several research papers have reported the application of ML or DL techniques on hERG blockers. However, smaller datasets were used in the previous publications as compared to the recent papers utilizing the DL approach. Datasets for hERG blockers have grown in recent years.

Fingerprints used in the former ligand-based drug design studies on hERG as well as other targets were mostly binary types representing 0 or 1 [19–22]. Fingerprints describe chemical substructures as numerical category data. The number of a binary type fingerprint is usually restricted to 1024 bits [23]. The integer type fingerprints describe chemical structures in more detail with various substructures. They produce a large number of category data types useful for ML. Although integer type fingerprints indicate diversity of chemical substructures, they are limited to commercial software such as the Pipeline Pilot (PP) module of Discovery Studio (DS) software (BIOVIA, San Diego, CA, USA) [24]. In the present study, we calculated integer as well as binary type fingerprints for a dataset of hERG blockers. Integer type fingerprints were computed using PP and they included extended-connectivity fingerprints (ECFP\_2, ECFP\_4 and ECFP\_6) and functional-class fingerprints (FCFP\_2, FCFP\_4 and FCFP\_6). Binary type chemistry development kit (CDK) fingerprints (standard, extended and graph) were computed using the R package. We used both types of fingerprints for ML and DL. Our results show that rational selection of fingerprints is important for hERG blocker prediction. We have discussed the advantages and disadvantages of integer and binary type fingerprints in ML and DL.

#### **2. Results**

#### *2.1. Dataset Splitting*

As discussed in the methodology section, a dataset consisting of hERG blockers was divided into the training and test sets in the ratio 4:1. The training and test sets consisted of 3991 and 998 compounds, respectively. Principle component analysis (PCA) was performed to verify the diversity of the chemical space of the dataset. PCA analysis with eight common descriptors, including partition coefficient (AlogP), molecular weight (MW), hydrogen-bond donor (HBD), hydrogen-bond acceptor (HBA), rotatable bond number (RBN), number of rings (Num Rings), number of aromatic rings (Num Arom Rings) and molecular fractional polar surface area (MFPSA), showed high chemical diversity of the compounds within the training and test sets (Figure 1). Generally, PCA descriptors are indirect numeric type representations of chemical structures. However, fingerprints representing chemical substructures are not numeric type. They are Boolean type indicating existence or nonexistence (1 or 0) of a unique fingerprint. Therefore, additional integer and binary type fingerprints were computed to validate the division of the dataset. As shown in Figure 2, integer type PP fingerprints (ECFP\_6 and FCFP\_6) and binary type CDK fingerprints (standard and extended) showed similar fingerprint frequency for the training and test sets. Binary type CDK fingerprints displayed relatively lower fingerprint frequency than integer type PP fingerprints. This is due to the limited size of the binary type CDK fingerprints (1024 bits). PCA analysis and fingerprint frequency calculations validate the appropriate splitting of the dataset into training and test sets.

**Figure 1.** Principle component analysis (PCA) for the training and test sets of hERG blockers. Independent variables included partition coefficient (AlogP), molecular weight (MW), hydrogen-bond donor (HBD), hydrogen-bond acceptor (HBA), rotatable bond number (RBN), number of rings (Num Rings), number of aromatic rings (Num Arom Rings) and molecular fractional polar surface area (MFPSA). Blue and yellow spheres indicate training and test sets, respectively.

**Figure 2.** Fingerprint frequency of the training and test sets of hERG blockers. X-axis denotes fingerprint identifier whereas Y-axis denotes the fingerprint frequency. Representative integer type PP fingerprints: (**A**) ECFP\_6 and (**B**) FCFP\_6. Representative binary type chemistry development kit (CDK) fingerprints: (**C**) standard and (**D**) extended. For validating the appropriate split of the dataset into training and test sets, the top 20 of PP integer fingerprint identifiers were selected as representatives depending on the inclusion order. Furthermore, the top 20 of CDK binary fingerprint identifiers were chosen as representatives depending on the binary sequence (1024 bits). Fingerprint frequency was calculated as the observed number of unique fingerprint identifiers/the total number of chemical data (training set = 3991 and test set = 998).

#### *2.2. Model Building and Evaluation*

The training set comprising 3991 compounds was used to build the models and the test set consisting of 998 compounds was used to evaluate the performance of the developed models. Different integer and binary type fingerprints were utilized for model building (Figure 3 and Supplementary materials Tables S1 and S2). All control models which lacked fingerprints and included only descriptors (AlogP, MW, HBD, HBA, RBN, Num Rings, Num Arom Rings and MFPSA) as features showed average predictive accuracy (Q) of 0.77 (ML = 0.79 and DL = 0.75) and an average area under the receiver operating characteristic curve (AUC) value of 0.82 (ML = 0.82 and DL = 0.81). The addition of fingerprints in the features improved the performance of the models. All models including integer type PP fingerprints (ECFP\_2, FCFP\_2, ECFP\_4, FCFP\_4, ECFP\_6 and FCFP\_6) displayed average Q

and AUC values of 0.87 (ML = 0.86 and DL = 0.88) and 0.92 (ML = 0.91 and DL = 0.93), respectively. Integer type fingerprints were converted to binary type using the "Convert Fingerprint" module of PP. The conversion of integer type to binary type fingerprints decreased the performance of the models. All models including converted binary type PP fingerprints demonstrated average Q and AUC values of 0.77 (ML = 0.75 and DL = 0.81) and 0.81 (ML = 0.77 and DL = 0.84), respectively. The open-source based CDK fingerprints (standard, extended and graph) are originally binary type and cannot be converted to integer type. Models with original binary type CDK fingerprints showed a comparatively higher performance than converted binary type PP fingerprints. All models, including binary type CDK fingerprints, exhibited average Q and AUC values of 0.86 (ML = 0.83 and DL = 0.89) and 0.90 (ML = 0.87 and DL = 0.93), respectively. Integer type PP fingerprints exhibited a higher performance than original binary type CDK fingerprints for ML. However, integer type PP fingerprints demonstrated a slightly lower performance than original binary type CDK fingerprints for DL. Compared to integer type PP fingerprints and original binary type CDK fingerprints, converted binary type PP fingerprints showed lower performance for both ML and DL. Comparison of different algorithms revealed that the RF algorithm produced better models than others. Models derived using the RF algorithm exhibited average Q and AUC values of 0.90 and 0.95, respectively, for both integer type PP fingerprints as well as original binary type CDK fingerprints. Furthermore, the RF algorithm produced the best model with FCFP\_2 integer type PP fingerprint. This model showed Q and AUC values of 0.91 and 0.95, respectively.

**Figure 3.** Models built using integer and binary type fingerprints for hERG blockers. X-axis denotes fingerprint types whereas Y-axis denotes the performance. Control lacking fingerprint only include descriptors, I: integer type PP fingerprints, CB: converted binary type PP fingerprints, OB: original binary type CDK fingerprints, NB: naïve Bayes, SVM: support vector machine (L: linear, P: polynomial, R: radial), RF: random forest, ANN: artificial neural network (layer size 100, 200, and 400) for deep learning. (**A**,**B**) Accuracy value for the models using machine learnings (**A**) and deep learning (**B**). (**C**,**D**) AUC value for machine learnings (**C**) and deep learning (**D**).

#### *2.3. External Validation*

Two external sets were used for further validation of the models. As discussed in the methodology section, external set 1 (Ex-1) from ChEMBL database and external set 2 (Ex-2) from recent publications [25–29] consisted of 263 and 47 hERG blockers, respectively. Model predictions for Ex-1 and Ex-2 using integer

and binary type fingerprints are summarized in Figure 4 and Tables S3–S6 (Supplementary materials), respectively. The control models exhibited an average Q value of 0.79 (ML = 0.80 and DL = 0.78) for Ex-1. The Ex-1 showed average Q values of 0.87 (ML = 0.86 and DL = 0.88), 0.80 (ML = 0.77 and DL = 0.82) and 0.86 (ML = 0.83 and DL = 0.88) for integer type PP fingerprints, converted binary type PP fingerprints and original binary type CDK fingerprints, respectively. Ex-1 displayed higher predictive accuracy for both integer type PP fingerprints and original binary type CDK fingerprints than for the control models. However, integer type PP fingerprints produced comparatively higher predictive accuracy than original binary type CDK fingerprints. Converted binary type PP fingerprints showed predictive accuracy similar to the control models. Integer type PP fingerprints exhibited higher predictive accuracy than original binary type CDK fingerprints for ML. However, integer type PP fingerprints exhibited predictive accuracy similar to the original binary type CDK fingerprints for DL. Control models exhibited an average Q value of 0.76 (ML = 0.79 and DL = 0.73) for Ex-2. The Ex-2 showed average Q values of 0.80 (ML = 0.80 and DL = 0.80), 0.71 (ML = 0.71 and DL = 0.70) and 0.70 (ML = 0.72 and DL = 0.68) for integer type PP fingerprints, converted binary type PP fingerprints and original binary type CDK fingerprints, respectively. Compared to the control models, Ex-2 displayed higher predictive accuracy for integer type PP fingerprints while it showed lower predictive accuracy for original binary type CDK fingerprints and converted binary type PP fingerprints. Integer type PP fingerprints exhibited higher predictive accuracy for both ML and DL as compared to the original binary type CDK fingerprints and converted binary type PP fingerprints.

**Figure 4.** Model prediction to external sets. X-axis denotes fingerprint types whereas Y-axis denotes the performance for accuracy. Control lacking fingerprint include only descriptors, I: integer type PP fingerprints, CB: converted binary type PP fingerprints, OB: original binary type CDK fingerprints, NB: naïve Bayes, SVM: support vector machine (L: linear, P: polynomial, R: radial), RF: random forest, ANN: artificial neural network (layer size 100, 200, and 400) for deep learning. (**A**,**B**) Ex-1 prediction for machine learnings (**A**) and deep learning (**B**). (**C**,**D**) Ex-2 prediction for machine learnings (**C**) and deep learning (**D**).

In accordance with the model-building results, converted binary type PP fingerprints reduced the predictive accuracy of the models. Original binary type CDK fingerprints exhibited comparatively higher predictive accuracy than converted binary type PP fingerprints for Ex-1. However, Ex-2 displayed slightly lower predictive accuracy for original binary type CDK fingerprints as compared to converted binary type PP fingerprints. This might be due to the small data size of the Ex-2

(47 compounds). Further parameters for external validation of the developed models are provided in Tables S3–S6 (Supplementary materials). These included true positive (TP), true negative (TN), false positive (FP), and false negative (FN). It can be seen in the supplementary tables that TN predictions were higher than FN. This might be because compounds in the whole dataset as well as the external sets were intended to develop as inhibitors for other drug targets but they also inhibited hERG, to produce toxicity. During model building, the RF algorithm displayed the same average Q value of 0.90 for both integer type PP fingerprints as well as original binary type CDK fingerprints. However, the RF model predictions for Ex-1 and Ex-2 using integer type PP fingerprints were found to be better than original binary type CDK fingerprints. The RF models with integer type PP fingerprints showed average Q values of 0.89 and 0.79 for Ex-1 and Ex-2, respectively. The RF models with original binary type CDK fingerprints exhibited average Q values of 0.88 and 0.74 for Ex-1 and Ex-2, respectively. The best model that was obtained using the RF algorithm and FCFP\_2 integer type PP fingerprint showed Q values of 0.90 and 0.81 for Ex-1 and Ex-2, respectively.

#### **3. Discussion**

The unwanted blockage of the hERG channel by drug candidates could lead to fatal cardiotoxicity. Thus, it is essential to screen compounds for activity on hERG channels early in the drug discovery process to decrease the risk of a drug candidate failing in preclinical safety studies. Due to the unavailability of the crystal structure of the hERG channel, researchers mainly use the ligand-based drug design approach to identify hERG blockers. Several ML and DL approaches have been applied. However, fingerprints used in the majority of the previous studies were mostly binary type [30,31]. These fingerprints are limited in size and generally restricted to 1024 bits. The integer type fingerprints describe chemical structures in more detail, but they are limited to commercial software. Fingerprints of the integer type have not been explored much for the prediction of hERG blockers or for the prediction of other drug target inhibitors. In this study, we computed both integer and binary type fingerprints for a dataset of hERG blockers and evaluated various ML (NB, SVM, RF and Bagging) and DL (ANN) algorithms. A training set of 3991 compounds was used to develop QSAR models. The performance of the developed models was evaluated using a test set of 998 compounds. Models were further validated using two external sets (Ex-1: 263 compounds and Ex-2: 47 compounds).

The overall results showed that addition of fingerprints to the descriptors improved the performance of the models. Models with integer type PP fingerprints displayed slightly better performance than models with binary type CDK fingerprints. Conversion of integer type PP fingerprints to binary type PP fingerprints reduced the performance of the models. Except for small-sized Ex-2, models with binary type CDK fingerprints showed better performance than converted binary type PP fingerprints. Comparison of different algorithms revealed that models built by RF outperformed those built by other algorithms. This is in agreement with a previous study that reported high performance of the RF models [17]. The RF algorithm demonstrated a similar performance for both integer type PP fingerprints as well as original binary type CDK fingerprints during model building. However, the RF models with integer type PP fingerprints showed comparatively better predictions for both the external sets than the RF models with original binary type CDK fingerprints. The best model was obtained using the RF algorithm and FCFP\_2 integer type PP fingerprint. Comparison of ML and DL algorithms revealed that ML models with integer type PP fingerprints demonstrated better performance and predictions than ML models with binary type CDK fingerprints. On the other hand, DL models performed slightly better with binary type CDK fingerprints as compared to DL models with integer type PP fingerprints. However, DL models exhibited similar predictions for both integer type PP fingerprints and binary type CDK fingerprints for Ex-1. In the case of small-sized Ex-2, ML and DL models showed better predictions with integer type PP fingerprints as compared to ML and DL models with binary type CDK fingerprints.

The outcomes of the study suggested that integer type fingerprints improved the performance and predictive ability of QSAR models. Although integer type fingerprints could be applied to both ML and DL, these fingerprints did not improve the predictive accuracy of DL models significantly. Moreover, they required long computation time due to the large number of features. Numerous fingerprint identifiers caused memory problems in some ANN packages such as "nnet". Due to memory problems, integer type fingerprints needed to be converted to binary type for DL. Our results demonstrated that conversion of fingerprints from integer to binary type reduced the performance and the predictive ability of the models in both ML and DL. Compared to integer type fingerprints, original binary type fingerprints produced DL models with slightly better performance. Furthermore, the predictive ability of DL models with binary type fingerprints was comparable to DL models with integer type fingerprints. Accordingly, binary type fingerprints are recommended for DL. Binary type fingerprints were suitable for both ML and DL due to their limited size. However, binary type fingerprints produced ML models with comparatively lower performance and predictive ability than ML models with integer type fingerprints. Consequently, integer type fingerprints are recommended for ML. In conclusion, rational selection of fingerprints is important for hERG blocker prediction.

#### **4. Materials and Methods**

#### *4.1. Dataset*

A dataset consisting of 5252 compounds with hERG inhibition values (IC<sup>50</sup> and K<sup>i</sup> ) was obtained from the ChEMBL database [32]. The reported K<sup>i</sup> values were converted into IC<sup>50</sup> values. In accordance with previous studies, compounds with IC<sup>50</sup> ≤ 1 µM and IC<sup>50</sup> > 10 µM were classified as active and inactive compounds, respectively [15,16]. Active and inactive compounds were defined as 1 and 0, respectively. Prior to splitting the dataset into training and test sets, 263 compounds (5% of the dataset) were extracted by random selection as Ex-1 using DS 2019 software (BIOVIA, San Diego, CA, USA). The remaining dataset with 4989 compounds was randomly partitioned into training and test sets in the ratio 4:1. The training and test sets consisted of 3991 and 998 compounds, respectively. In addition to the ChEMBL dataset, 47 hERG blockers reported in recent research papers were collected as Ex-2 from recent publications [25–29]. Active and inactive compounds for the external set 2 (Ex-2) were defined in the same way as discussed for the ChEMBL dataset (Table 1). The training set was used for the model building whereas the test and external sets (Ex-1 and Ex-2) were used for model evaluation.


**Table 1.** Dataset used in the present study.

#### *4.2. Fingerprint Calculation*

The PP integer type fingerprints were computed using the "molecular fingerprint" module of PP. The "atom abstraction" option was set to atom type and functional class for calculating ECFP and FCFP fingerprints, respectively. The "maximum distance" option was set to 2, 4 and 6 for generating different types of ECFP and FCFP fingerprints (ECFP\_2, FCFP\_2, ECFP\_4, FCFP\_4, ECFP\_6 and FCFP\_6). Integer type fingerprints were converted to binary type fingerprints using the "Convert Fingerprint" module of PP. CDK binary type fingerprints (standard, extended and graph) were computed with "rcdk" package [33] of R (version 3.5.2, R Core Team, Vienna, Austria) [34]. The "get.fingerprint" function was used for the calculation of the fingerprints. The fingerprint frequency was determined for validating the appropriate split of the dataset into training and test sets. Fingerprint frequency was

calculated as the observed number of unique fingerprint identifiers/the total number of chemical data (training set = 3991 and test set = 998).

#### *4.3. Model Building*

In this study, we evaluated one DL algorithm: ANN, and four ML algorithms: NB, SVM, RF and bagging. DS 2019 software was used for NB and bagging because it supports only these two algorithms. R package was employed for NB, SVM, RF, ANN. NB is available in both R and DS packages. NB-DS computes only integer type fingerprints but it is possible to calculate both integer as well as binary type fingerprints using NB-R. We utilized both the packages to evaluate NB. The "create Bayesian model" protocol of DS was used for NB-DS. The laplacian value was set to the default value of 1. The "create recursive partitioning model" protocol of DS was employed for decision-tree based bagging such as RF. The number of trees was set to 100. The "e1071" package [35] of R was utilized for NB and SVM. NB models were created using the "naïveBayes" function. The Laplace parameter was set to 1. SVM models were developed using the "svm" function. The linear, polynomial and radial methods were implemented for SVM. While implementing, linear, polynomial and radial kernel functions were selected. The "randomForest" package [36] of R was used for a decision-tree based RF. The number of trees was set to 100. The "h2o" package [37] of R was utilized for ANN with three hidden layers. The "Rectifier activation" function was employed, and the number of iterations was set to 20. The layer size parameter was set to 100, 200, and 400 for three hidden layers. Descriptors were calculated using DS. These included AlogP, MW, HBD, HBA, RBN, Num Rings, Num Arom Rings and MFPSA. Except for the control models, all developed models included fingerprints. Control models comprised only descriptors.

#### *4.4. Model Evaluation*

Model performance was evaluated in terms of predictive accuracy (Q), the area under the receiver operating characteristic curve (AUC), true positive (TP), true negative (TN), false positive (FP) and false negative (FN). The "calculate molecular property" protocol of DS 2019 software and "predict" function of R package were used for the model evaluation.

**Supplementary Materials:** The following are available online. Table S1. Models built using integer type fingerprints. Table S2. Models built using binary type fingerprints. Table S3. Model prediction for external set 1 using integer type fingerprints. Table S4. Model prediction for external set 1 using binary type fingerprints. Table S5. Model prediction for external set 2 using integer type fingerprints. Table S6. Model prediction for external set 2 using binary type fingerprints. The supplementary excel data for the compound list and detailed model prediction data used in this research for hERG blocker prediction.

**Author Contributions:** Conceptualization, N.S.K.; methodology, K.-E.C.; software, K.-E.C.; validation, N.S.K., K.-E.C. and A.B.; formal analysis, K.-E.C. and A.B.; investigation, K.-E.C.; resources, N.S.K.; data curation, K.-E.C.; writing—original draft preparation, K.-E.C.; writing—review and editing, N.S.K. and A.B.; visualization, K.-E.C.; supervision, N.S.K.; project administration, N.S.K.; funding acquisition, N.S.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, NRF-2020R1A2C100691511 and NRF-2016M3A9E4947695.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Sample Availability:** Samples of the compounds are not available from the authors.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Molecules* Editorial Office E-mail: molecules@mdpi.com www.mdpi.com/journal/molecules

MDPI St. Alban-Anlage 66 4052 Basel Switzerland

Tel: +41 61 683 77 34 Fax: +41 61 302 89 18

www.mdpi.com ISBN 978-3-0365-2778-9