**1. Introduction**

The liver is a complex organ, and one of its main functions is to regulate lipid homeostasis through the interplay of various hormones, nuclear receptors, and transcription factors [1]. This includes synthesizing new lipids, coordinating their transport to other parts of the body, and utilizing them as energy substrates [2]. Any imbalance between these pathways can result in the accumulation of fat. In the absence of alcohol consumption, non-alcoholic fatty liver disease (NAFLD) is characterized by the build-up of ectopic fat in the liver, also known as steatosis [3]. NAFLD is the current liver pandemic sweeping the globe. It is estimated that about 25% of the world's population is affected by NAFLD [4]. There is a very close association with dyslipidemia, type 2 diabetes, central obesity, and metabolic syndrome, with reported occurrences of 69%, 23%, 51%, and 43%, respectively [4]. Consequently, the rate of NAFLD diagnoses has been steadily rising with the ever-increasing obesity burden. Despite this high prevalence, public awareness of the disease and its effects is still lacking.

More information is constantly being revealed on the pathophysiology of NAFLD; however, the general concept is that more lipids are being retained than the hepatocyte is able to expel. The four major pathways regulating lipid acquisition and disposal are uptake of free circulating fatty acids, de novo lipogenesis, fatty acid oxidation, and export of lipids from the liver as very low-density lipoproteins (VLDL) [5] (Figure 1). The uptake of circulating lipids is directly proportional to the concentration of plasma free fatty acids (FFAs), which are mainly derived from the body's supply of adipose tissue [6]. This mechanism will be explored further in terms of genetic modifications that can be made to lipid transport proteins. Lipid accumulation can be characterized as either microsteatosis or macrosteatosis. Microsteatosis represents small lipid droplets that do not displace the cell's nucleus, and these livers generally are not associated with many complications. Macrosteatosis, which involves displacement of the nucleus, is a condition that can range from simple lipid build-up with minimal effects to non-alcoholic steatohepatitis (NASH) that could result in fibrosis, cirrhosis, and eventual end-stage liver disease [3]. With these complications in mind, one can imagine how NAFLD has become a leading cause of liver transplantation as well as the reason many livers are not suitable for donation. Donor liver steatosis is a significant risk factor for post-transplant complications because steatotic grafts are more susceptible to ischemia reperfusion injury (IRI), which could result in post-operative graft dysfunction, graft loss, and the need for re-transplantation [7].

**Citation:** Young, E.N.; Dogan, M.; Watkins, C.; Bajwa, A.; Eason, J.D.; Kuscu, C.; Kuscu, C. A Review of Defatting Strategies for Non-Alcoholic Fatty Liver Disease. *Int. J. Mol. Sci.* **2022**, *23*, 11805. https://doi.org/10.3390/ijms231911805

Academic Editors: Patrick C. Baer and Ralf Schubert

Received: 10 September 2022 Accepted: 28 September 2022 Published: 5 October 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

**Figure 1.** Major pathways regulating lipid acquisition and disposal inside the liver. Uptake of free circulating lipids and de novo lipogenesis increase the amount of fat in liver cells. Several fatty acid oxidation mechanisms and transport of fat as very low-density lipoproteins (VLDL) decrease the amount of fat inside the liver cells (Created with BioRender.com).
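As a rough formalization of the balance shown in Figure 1, the hepatocyte lipid pool can be written as the sum of two input fluxes minus two output fluxes. The symbols below ($L$ for intracellular lipid content and $J$ for each pathway flux) are illustrative assumptions and are not notation taken from the cited studies:

$$
\frac{dL}{dt} = J_{\text{uptake}} + J_{\text{DNL}} - J_{\text{ox}} - J_{\text{VLDL}}
$$

Under this sketch, steatosis corresponds to a sustained $dL/dt > 0$, and each defatting strategy discussed in this review can be read as an attempt either to lower $J_{\text{uptake}}$ or $J_{\text{DNL}}$, or to raise $J_{\text{ox}}$ or $J_{\text{VLDL}}$.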

Regardless of how fat accumulates in the hepatocytes, its presence can result in endoplasmic reticulum, oxidative, and mitochondrial stress along with impaired autophagy [3]. The two-hit hypothesis was proposed in 1998 by Day and James to explain the pathogenesis of NAFLD [8]. According to their model, the "first hit" occurs when triglycerides (TGs) build up in the hepatocytes. Inflammation and necrosis follow in the lipid-filled hepatocytes when the TGs undergo peroxidation, thus signifying the "second hit". As cellular damage progresses, NAFLD can evolve into NASH, which is characterized by increased inflammation. Approximately 5–18% of patients with NASH develop cirrhosis, rising to about 38% when fibrosis is also present [9]. Patients with NASH and fibrosis or cirrhosis are at an increased risk of hepatocellular carcinoma.

A large amount of research has gone into understanding the pathophysiology of NAFLD in order to find ways to prevent and possibly reverse fat accumulation in hepatocytes. This paper will review current in vitro defatting techniques and machine perfusion therapy along with exploring possible genomic approaches that could affect fatty acid transport into hepatocytes as well as de novo lipogenesis. Targeting specific genes and proteins involved in the mechanism of steatosis could be a potential avenue of future treatment for both donors and those on the transplant list.

#### **2. In Vitro Defatting Techniques**

Several studies have examined the effects of in vitro defatting strategies. In 2013, Nativ et al. explored the effect of macrosteatosis reduction approaches on lipid droplet size as well as hepatocyte viability and liver-specific functions [10]. In this experiment, primary hepatocytes from lean Zucker rats were cultured in a medium with equal amounts of oleic and linoleic acids to induce steatosis [10–12]. Then, forskolin (a glucagon mimetic), the PPARα agonist GW7647, the PPARδ agonist GW501516, scoparone, hypericin (a pregnane X receptor [PXR] ligand), visfatin (an adipokine), and amino acids were used as a defatting cocktail [9]. This cocktail had been shown to reduce macrosteatosis by activating hepatocellular TG metabolism [12]. The study showed that macrosteatosis can be induced in primary hepatocytes and reversed with steatosis reduction supplements (SRS). The authors also concluded that accelerated macrosteatotic reduction led to a faster recovery of urea secretion and bile canalicular formation, with the same viability as seen in lean rat hepatocytes. These results indicate that hepatocyte functional recovery depends on either macrosteatosis reduction time or direct effects of the SRS.

Nativ et al. conducted another study to determine the sensitivity of hepatocytes to hypoxia/reoxygenation (H/R) stress after exposure to a defatting protocol [13]. They used the same protocol, with SRS enriched with L-carnitine. They showed that the amount of steatosis is correlated with vulnerability to H/R injury, and that lean or microsteatotic hepatocytes are resistant to injury. Macrosteatotic hepatocytes have a reduced ability to produce ATP, which is related to increased production of reactive oxygen species. Lean and microsteatotic hepatocytes also showed better liver-specific function, such as urea secretion and bile canalicular transport, than those with macrosteatosis. The authors suggested that lipid-lowering defatting agents and their combinations could help overcome H/R stress and provide a possible recovery mechanism for discarded macrosteatotic liver grafts [13].

Another experiment conducted in 2016 by Yarmush et al. examined the use of a defatting cocktail, as described by Nativ et al. [10,13], on HepG2 cells with induced steatosis [14]. They performed their experiment under normoxic and hyperoxic conditions. They showed that TG storage levels decreased along with an increase in beta-oxidation, the tricarboxylic acid cycle, and the urea cycle. All these parameters were augmented by SRS and hyperoxic conditions. They also found that the rate of extracellular glucose uptake was minuscule compared to the amount supplied by glycogenolysis within the cell, which also showed active glycolysis. In conclusion, both glycolysis and beta-oxidation were occurring at the same time, which is not expected under normal conditions. Typically, glycolysis occurs when the body is in the fed state while beta-oxidation occurs during fasting. However, the combination of defatting agents does not mimic any known metabolic condition, so the unusual result could be anticipated.

In 2018, Boteon et al. sought to determine the effect of defatting agents on primary human liver cells [15]. This was the first study of this type to use primary human hepatocytes (PHH), which were isolated from discarded donor livers. PHH were incubated with standard medium supplemented with FFAs, consisting of palmitic, linoleic, and oleic acids. Forskolin, GW7647, hypericin, scoparone, GW501516, visfatin, and L-carnitine were added and incubated for 48 h. They successfully reduced the intracellular lipid accumulation by 54% and TG levels by 35%. Furthermore, production of ketone bodies was increased, indicating that beta-oxidation was occurring at a higher rate. In cytotoxicity experiments, they used human intra-hepatic endothelial cells (HIEC) and human cholangiocytes, and viability was measured for all cultures, including PHH. They demonstrated an 11% increase in viability of PHH treated with defatting drugs compared to the fatty control group. Moreover, there was no difference in viability between the treated and control groups of HIEC, while the viability of human cholangiocytes improved, although this result was not statistically significant. It was the first study to show that defatting drug cocktails are effective in reducing the lipid content of PHH while causing no harm to non-parenchymal cells.

Recently, Aoudjehane and colleagues used a novel defatting cocktail in three different human culture models: PHH with induced steatosis, PHH isolated from a steatotic liver, and precision-cut liver slices (PCLS) from a steatotic liver [16]. They used a similar defatting cocktail (including forskolin, L-carnitine, and the PPARα agonist GW7647) in addition to two new agents, rapamycin and necrosulfonamide. Rapamycin is an immunosuppressant that can reduce steatosis by inhibiting mammalian target of rapamycin (mTOR). This action suppresses lipogenesis and promotes TG secretion and macro-autophagy [17–19]. Necrosulfonamide (NSA) is an inhibitor of an effector in the necroptosis pathway that recently was revealed as a regulator of TG storage in the liver [20]. The new cocktail produced a significant decrease in lipid droplets and TG levels in steatosis-induced PHH, in PHH isolated from fatty livers, and in PCLS. Additionally, they reported a reduction in endoplasmic reticulum stress and reactive oxygen species production. By using PCLS, this study was the first to demonstrate that defatting agents are successful in a model that maintains the 3D structure of liver tissue and the interactions between hepatocytes and other liver cell types. Table 1 summarizes the chemicals that have been used in in vitro experiments.


**Table 1.** Summary of In Vitro Defatting Techniques.

#### **3. Machine Perfusion Defatting Techniques**

#### *3.1. Preclinical Studies*

The detrimental effects of static cold storage on transplantable livers have produced a need for a more efficient strategy to store organs for future use, such as ex vivo machine perfusion. Ex vivo machine perfusion has been shown not only to reduce storage injury but also to provide an opportunity to treat damaged livers before transplantation. Much research has gone into this field over recent years, leading it to become an alternative strategy to reduce preservation injury [21].

Bessems et al. conducted a study to compare cold storage with machine perfusion [21]. They induced macrosteatosis in a rat model by feeding a methionine- and choline-deficient diet before using either hypothermic cold storage or machine perfusion (4 °C) for 24 h. They reported that machine-perfused livers had significantly less damage as well as higher bile production, ammonia clearance, urea production, oxygen consumption, and ATP levels compared to static cold storage samples. Additionally, Kron et al. performed a similar experiment to determine the viability of machine-perfused grafts after transplantation [22]. After inducing macrosteatosis, livers were transplanted after either <1 h of cold storage, 12 h of cold storage, 12 h of cold storage followed by 1 h of hypothermic oxygenated perfusion (HOPE), or 12 h of cold storage followed by 1 h of hypothermic nonoxygenated perfusion (HNPE). They reported that HOPE therapy before transplantation resulted in significantly decreased reperfusion injury, evidenced by less oxidative stress, nuclear injury, macrophage activation, and fibrosis after one week. However, these protective effects were lost in the absence of oxygen in the perfusate, and this study did not find any reduction in the level of steatosis after HOPE therapy. Early trials of normothermic (37 °C) machine perfusion with an oxygenated blood-based perfusion system showed that a reduction in steatosis could be achieved in porcine models [23]. Even without the addition of a pharmacologic defatting cocktail, Jamieson et al. observed a 13% reduction in diet-induced liver steatosis after 48 h of normothermic ex vivo machine perfusion. However, other studies have since shown that the use of defatting cocktails further enhances the effect of machine perfusion.

Nagrath et al. reported a 65% reduction in hepatocyte TG content after three hours of normothermic perfusion with a defatting cocktail (a PPARα ligand (GW7647), a PPARδ ligand (GW501516), hypericin, scoparone, forskolin, and visfatin) [12]. They concluded that greater oxygen availability led to an increase in beta-oxidation and a subsequent reduction in steatosis. However, normothermic perfusion alone still reduced TG content by 30%, demonstrating the inherent defatting capabilities of this technique. They further tested whether subnormothermic (20 °C) perfusion with the same defatting cocktail [12] would have an equivalent effect while being easier to maintain [24]. The results revealed no significant reduction in intracellular lipid content after six hours of perfusion in livers from obese Zucker rats. This indicates that higher temperatures may be required for cellular metabolism, specifically beta-oxidation, to take place at a high rate.

Another novel mechanism for liver defatting was described by Vakili et al. and involves the use of glial cell line-derived neurotrophic factor (GDNF) [25]. They had previously explained the mechanism behind the protective effect of GDNF against high-fat diet-induced steatosis in mice, namely a reduction in PPARγ expression [26], and wanted to test its potential before transplantation. Steatotic and lean donor livers were both perfused with either the vehicle, GDNF, or the same defatting cocktail described previously. They found that GDNF was as effective as the defatting agents [12] at reducing TG content in hepatocytes (>40% reduction); however, GDNF induced less liver damage than the defatting cocktail, which caused a significant rise in lactate dehydrogenase activity. Thus, GDNF may prove to be a more suitable option for treating livers ready for transplant.

In a recent study, Raigani et al. demonstrated how a defatting cocktail in normothermic perfusion improves markers of cell viability in a mouse model [27]. They implemented the same defatting cocktail supplemented with L-carnitine and amino acids [12]. They showed a reduction in perfusate lactate and better bile quality along with a decrease in inflammatory markers such as tumor necrosis factor-α (TNFα) and NF-κB, and in apoptosis markers, specifically caspase-3 and Fas cell surface death receptor. The results also showed an increase in gene expression of mitochondrial beta-oxidation markers; however, there was no significant reduction in hepatocyte steatosis. They concluded that perhaps a clinical improvement in liver function is more important than decreasing lipid content alone. Table 2 summarizes the preclinical studies.

#### *3.2. Clinical Trials*

Although much work has been done on fatty livers in animal models, methods for studying human livers are still being refined. In 2010, Guarrera and colleagues first experimented with human livers to see whether hypothermic machine perfusion (HMP) would preserve the organs better than cold storage. They found that HMP is promising for use before transplantation and, based on early biochemical markers, may improve graft function [28]. Other studies have looked at using hypothermic machine perfusion to assess the quality of liver grafts [29,30]. Both groups used human livers, some potentially transplantable and some discarded mainly due to steatosis. The studies found that damaged livers released higher levels of injury markers such as AST, ALT, and lactate dehydrogenase (LDH). Discarded donor livers also had a lower ATP recovery rate compared to the potentially transplantable group. At hypothermic temperatures, both groups also found that the morphology was preserved with or without oxygen supplementation.


**Table 2.** Summary of Machine Perfusion Techniques–Preclinical Studies.

Clinical trials using HOPE to preserve livers destined for transplant have also been underway in recent years. These studies evaluated liver function in transplant patients after their organ was preserved using HOPE or static cold storage. One study specifically looked at the risk of non-anastomotic biliary strictures (NAS) and found that livers preserved with HOPE had significantly fewer occurrences of biliary strictures, post-reperfusion syndrome, and early allograft dysfunction [31]. Two other studies also used HOPE to preserve grafts and examined liver enzymes and function after transplantation [32,33]. Both groups found that patients who received livers preserved using HOPE had significantly lower levels of liver injury enzymes, less graft dysfunction, fewer 90-day complications, and shorter hospital stays. These results are very promising for the future of liver transplantation.

A great deal of research has also gone into the effects of normothermic machine perfusion on liver graft quality. Several studies have specifically used normothermic perfusion to assess graft viability by evaluating markers like injury enzymes, lactate clearance, and bile production [34–36]. These studies were performed on previously discarded livers and determined that they were suitable for transplant after normothermic perfusion. Grafts perfused before transplant typically resulted in better patient outcomes and fewer post-surgery complications. There has also been more interest in normothermic perfusion for treating hepatic steatosis. For example, Liu et al. investigated changes in the lipid profiles of ten perfused livers [37]. They found that perfusate TG levels significantly increased from 1 h to 24 h of treatment; however, there was a decrease in perfusate levels of total cholesterol, high-density lipoproteins, and low-density lipoproteins. Additionally, there was no significant decrease in steatosis histologically. They attributed their findings to differences in hepatic morphology compared to animal models along with the chronic accumulation of fat in humans compared to diet-induced steatosis in animal experiments. Nevertheless, the increase in perfusate TGs indicates that active metabolism occurred in the grafts. Thus, normothermic machine perfusion may be a potential starting point for liver defatting if combined with pharmacological cocktails.

Boteon and colleagues conducted an experiment using normothermic machine perfusion supplemented with a previously described defatting cocktail [12] to assess intracellular lipid reduction in ten discarded livers [38]. Compared to a control group perfused with vehicle only, the five livers perfused with defatting agents for six hours had a reduction in tissue TGs and macrosteatosis of 38% and 40%, respectively. The team also saw increased beta-oxidation with higher ATP production along with improved viability markers such as higher urea production, higher bile production, and lower injury enzymes. These results are promising for the future of pre-transplant liver treatment. Even though the same group has previously demonstrated minimal cytotoxicity of these drugs directly on cholangiocytes and intrahepatic endothelial cells [15], the systemic effects have not yet been investigated in humans. More extensive trials would need to take place to establish drug safety before this defatting strategy could be implemented in patients. Table 3 summarizes the perfusion methods used in clinical trials.

**Table 3.** Summary of Machine Perfusion Techniques–Clinical Trials.


#### **4. Genomic Approaches for Defatting Strategies**

Hepatic steatosis is the result of dysfunction between the four major pathways regulating lipid metabolism in the liver. These pathways include the uptake of fatty acids (FAs), de novo lipogenesis, beta-oxidation, and transport out of hepatocytes [5]. Abnormal levels of proteins associated with both the uptake and transport of lipids and de novo lipogenesis have been documented in patients with NAFLD, and these provide a potential target for treatment. Specifically, this section will focus on recent studies that have incorporated genomic approaches to influence the processes involved in intracellular lipid accumulation.

To begin with hepatocyte lipid transport, FAs are primarily moved into the cells by transporters with diffusion playing a much smaller role [39]. The main proteins involved in this process include fatty acid transport proteins (FATP), cluster of differentiation 36 (CD36), and caveolins, all located in the plasma membrane. Out of six possible FATPs, FATP2 and FATP5 are primarily found in the liver and have been implicated in the pathogenesis of NAFLD [40]. Studies have shown that both FATP2 and FATP5 are expressed at higher levels in patients with NAFLD that progresses to NASH [5].

Falcon et al. conducted a study to characterize the role of FATP2 in lipid transport [41]. By using an adeno-associated virus (AAV)-based knockdown strategy, they were able to inhibit the function of FATP2 in vivo using mouse models. One week after injection with the viruses, a 40% decrease in intracellular lipid uptake was observed in the FATP2 knockdown mice compared to control. The team also found that knockdown of FATP2 resulted in lowered liver TGs but did not affect hepatic free and esterified cholesterol levels. Liver injury enzymes, liver histology, and feeding behavior were also not affected by inhibition of FATP2, indicating this as a promising route for further investigation. Additionally, early studies using AAV-based strategies showed that knockout of FATP5 significantly reduces FA uptake in hepatocytes and TG content, and reverses steatosis [42,43].

CD36 is also found at higher levels in patients with NAFLD. CD36 is a translocase protein that aids in the transport of long-chain FAs and is regulated by PPAR-γ, pregnane X receptor, and liver X receptor [44]. Wilson et al. determined the role of CD36 in the pathogenesis of NAFLD [45]. They found that deletion of the *Cd36* gene in mice fed with a high-fat diet resulted in reduced liver lipid content and hepatocyte FA uptake. Additionally, the mice had improved whole-body insulin sensitivity and reduced liver inflammatory markers, making this protein an excellent target for NAFLD gene therapy.

The caveolin protein family consists of three members, termed caveolin-1, -2, and -3, which are found in the plasma membrane and facilitate protein trafficking and lipid droplet formation [40]. Early studies found that caveolin-1 levels were increased in mice fed a steatosis-inducing high-fat diet for 14 weeks, indicating that caveolins may be implicated in the pathogenesis of NAFLD [46]. In contrast, Li et al. recently demonstrated that wild-type mice with NAFLD induced by a high-fat diet had markedly reduced expression of the caveolin-1 gene [47]. Similarly, mice with caveolin-1 knockdown had augmented steatosis, increased plasma cholesterol, and elevated liver injury enzymes, whereas overexpression of the gene resulted in significantly attenuated lipid accumulation in hepatocytes. Another study aimed to determine the protective mechanism behind caveolins in NAFLD through both in vitro and in vivo methods [48]. This study revealed decreased levels of caveolins and autophagy-related proteins after exposure to high levels of FAs for an extended period of time. The authors further concluded that inhibition of the Akt/mTOR pathway was involved in the protective role of caveolin-1 in autophagy and lipid metabolism in NAFLD.

Fatty acid binding protein (FABP) is another lipid transport protein that is found intracellularly. Following passage through the plasma membrane, lipids are not allowed to travel freely through the cytosol; instead, FABPs shuttle them between different organelles where various metabolic processes take place. Specifically, FABP1 is the predominant isoform found in the liver [40]. It is thought that FABP1 is cytoprotective due to its ability to facilitate the oxidation or TG incorporation of potentially lipotoxic FAs that accumulate in the cytosol [49]. An early study demonstrated that patients with NAFLD had much higher mRNA levels of FABP1 compared to controls [50]. The authors concluded that this was a compensatory mechanism where the cell attempted to store or release excess lipid. However, the enhanced FA trafficking may lead to storage of harmful lipid levels, promoting steatosis. Knocking out FABP1 in mice resulted in decreased hepatic TGs and lipid disposal pathways, as expected [51]. Additionally, a more recent study of FABP1 knockout found that expression of inflammatory and oxidative stress markers, as well as a marker of lipid peroxidation, were also significantly decreased [52]. Findings like these indicate that attenuation of FABP1 production could be a potential route of treatment for NAFLD.

De novo lipogenesis (DNL) enables the liver to synthesize new FAs from acetyl-CoA [5]. The two main enzymes involved are acetyl-CoA carboxylase (ACC) and fatty acid synthase (FASN). Once the new FAs have been synthesized, they must undergo a myriad of modifications before being stored as TGs or exported as VLDL particles [5]. Thus, an increased rate of DNL could easily cause accumulation of lipids inside hepatocytes, leading to steatosis and eventually NAFLD.
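For orientation, the two enzymatic steps named above can be written with their standard textbook stoichiometry. These equations are general biochemistry added here for reference and are not taken from the studies cited in this review:

$$
\text{acetyl-CoA} + \text{HCO}_3^- + \text{ATP} \xrightarrow{\ \text{ACC}\ } \text{malonyl-CoA} + \text{ADP} + \text{P}_i
$$

$$
\text{acetyl-CoA} + 7\,\text{malonyl-CoA} + 14\,\text{NADPH} + 14\,\text{H}^+ \xrightarrow{\ \text{FASN}\ } \text{palmitate} + 7\,\text{CO}_2 + 8\,\text{CoA} + 14\,\text{NADP}^+ + 6\,\text{H}_2\text{O}
$$

Because every palmitate requires seven malonyl-CoA units, lowering ACC activity limits the substrate available to FASN, which is one way to read the ACC inhibition studies discussed in this section.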

Two key transcription factors involved in the regulation of DNL are sterol regulatory element-binding protein 1c (SREBP1c) and carbohydrate response element-binding protein (ChREBP) [5]. SREBP1c is activated by insulin and liver X receptor α [53]. Studies have found elevated levels of SREBP1c in patients with NAFLD [54] along with an expected rise in hepatic TG levels in mice genetically engineered to overexpress the protein [55]. Moreover, SREBP1c knockout mice displayed decreased levels of mRNAs encoding enzymes like ACC and FASN, both critical for DNL [56]. ChREBP, on the other hand, is only activated by carbohydrates and does not facilitate fat-induced lipogenesis, whereas high-fat diets may reduce the activity of ChREBP [5]. Iizuka et al. reported that ChREBP knockout in mice reduced hepatic TG synthesis by 65%, but insulin resistance, delayed glucose clearance, and intolerance to simple sugars were also observed [57]. This shows the importance of ChREBP in both DNL and glucose metabolism. Another experiment demonstrated that ChREBP knockout protected against fructose-induced steatosis in mice while enhancing hepatic damage through increased cholesterol synthesis and resultant cytotoxicity [58]. This finding led the authors to believe that ChREBP may have a cytoprotective effect by limiting cholesterol toxicity. Thus, increased levels of ChREBP in patients with NAFLD may reflect a defense mechanism to prevent progression to NASH. It has also been postulated that lipogenesis may cause steatosis but prevent disease conversion to NASH [59]. Finally, hepatic overexpression of ChREBP in mice produced steatosis from upregulated DNL, but also maintained insulin sensitivity and glucose tolerance [60]. As detailed, both transcription factors of DNL play important roles in the pathogenesis of NAFLD; however, it was concluded that SREBP1c is the predominant regulator [60]. Patients with NAFLD had lowered levels of ChREBP while SREBP1c was upregulated, causing increased activity of ACC and FASN.

As previously stated, ACC is a crucial enzyme that regulates the rate-limiting step of DNL and is elevated in response to SREBP1c in NAFLD patients. Early studies were conducted with knockout of different isoforms of the enzyme. Mao et al. generated liver-specific ACC1 knockout mice in which generation of malonyl-CoA, the product of ACC1, was 75% lower compared to control mice, and the livers accumulated 40–70% less TGs after feeding a fat-free diet for 10 days [61]. However, the synthesis of lipogenic enzymes was increased in ACC1 knockout livers, possibly due to overexpression of ACC2. Another study demonstrated that inhibition of both ACC1 and ACC2 achieved reversal of hepatic steatosis, reduced malonyl-CoA levels, and improved insulin sensitivity by increasing the rate of beta-oxidation [62]. Very recently, a great deal of work has been done to create small molecule inhibitors of ACC1 and ACC2. Matsumoto et al. used GS-0976 (firsocostat) in mice with diet-induced steatosis to inhibit both isoforms, which resulted in a significant histological reduction in hepatocyte TG content [63]. The small molecule also reduced the areas of hepatic fibrosis and lowered elevated liver injury markers, indicating a potential use in NASH as well as NAFLD. Another novel small molecule inhibitor specific for ACC1, called compound 1, has been investigated for its potential role in the treatment of NAFLD [64]. It has been shown that dual inhibition of ACC1 and ACC2 causes an increase in plasma TG levels, which could be harmful to some patients [63]. However, selective ACC1 inhibition in mice did not affect plasma TG levels compared to controls while still achieving a significant reduction in hepatic steatosis and fibrosis. The study concluded that this effect is due to preserved activity of ACC2, which compensates for the loss of malonyl-CoA production from ACC1. Overall, these drugs show great promise for the future of NAFLD/NASH treatment.

FASN is another key enzyme involved in DNL whose primary role is converting malonyl-CoA to palmitate [5]. As with ACC, studies have examined the effects of modifying the activity of FASN in hepatocytes. One study evaluated the effects of FASN knockout in mice fed a zero-fat diet [65]. After prolonged fasting, the mice unexpectedly displayed hypoglycemia, fatty liver, and defects in the expression of PPARα, a transcription factor integral to beta-oxidation of FAs. Overexpression of FASN did not induce any histological changes in the liver; however, complete genetic ablation of FASN resulted in a decline in cell proliferation and a rise in apoptosis [65]. Although inhibition of FASN may improve liver viability by decreasing DNL, it seems to have more important regulatory properties that are disastrous for the cell if interrupted.

Another enzyme involved in liver lipid metabolism is stearoyl-CoA desaturase-1 (SCD1), which is responsible for the conversion of saturated FAs to monounsaturated FAs [5]. This effect is thought to be protective against NAFLD [60]. In support of this, Li et al. discovered that incubation of hepatocytes with saturated FAs lowered cell viability while incubation with monounsaturated FAs did not affect viability even though there was enhanced lipid accumulation [66]. Within the same study, SCD1 knockout was performed in mouse hepatocytes with diet-induced NASH and resulted in increased fibrosis and cellular apoptosis. Therefore, inhibition of SCD1 could exacerbate NAFLD or NASH due to excessive build-up of cytotoxic lipid species when monounsaturated FAs are not created to be safely stored. Overall, SCD1 proves to be a potentially effective target for the treatment of NAFLD progression.

A recent study showed that the transcription factor zinc fingers and homeoboxes 2 (ZHX2) also has a role in NASH models [67]. ZHX2 suppressed NASH progression in steatotic hepatic cells and reduced inflammation and fibrosis in the liver. Conversely, knocking out ZHX2 exacerbated NASH progression in animal models, as confirmed by increased lipid accumulation, aggravated inflammation, and increased fibrosis scores in the liver. This protective effect occurs predominantly through activation of the phosphatase and tensin homolog (PTEN) gene by ZHX2.

In addition to the characterized genes and their functions in lipid accumulation in hepatocytes, functional genomics will help us find novel genes for defatting purposes. Recently, Hilgendor et al. used genome wide CRISPR screening for adipocytes and sorted the cells according to their lipid content. They identified candidate adipogenic regulators [68]. Similar experiments will shed light on hepatocyte lipid accumulation and will give us some novel targets to prevent fat accumulation in hepatocytes.

Improvement of techniques and a more affordable cost in next generation sequencing have allowed researchers to accumulate more genetic information from patients. In this regard, genome-wide association studies (GWAS) will identify genes and their associated alterations in the genome (called SNPs) with a trait or disease like NAFLD. Two independent groups used large cohorts (10k and 20k) to find a novel association between NAFLD and target genes in European ancestry [69,70]. PNPLA3 has been characterized as a risk factor for the susceptibility of fat accumulation. Further studies in different geographic locations and with more diversity in race are necessary to conclude the gene-trait interaction. Table 4 summarizes the gene manipulations for liver defatting.


**Table 4.** Summary of Genomic Approaches for Liver Defatting.



#### **5. Additional Interventions for Liver Defatting**

Both surgical and medical interventions have been shown to have an effect on people diagnosed with NAFLD. For example, bariatric surgery is an option for patients with obesity-related diseases who are able to commit to long-term medical follow-up. It has been shown that after bariatric surgery, patients have significant improvement in steatosis (resolution of NASH in 80% of patients in first year following surgery), which is related to the improvement in insulin resistance [71,72]. In addition to the resolution of NASH, decreasing fibrosis in the liver could be beneficial for the health of the liver [73]. However, some studies have controversial results regarding improvement in liver fibrosis in NASH patients [72]. In a recent long term follow-up study, NASH resolution in liver samples was observed in 84% of patients 5 years after bariatric surgery, and the reduction in liver fibrosis was most prominent in the first year with continued progression through 5 years [74].

Another option for treating people with NAFLD is the use of currently available medicines. One of the major drivers of NASH is insulin resistance, which is also a dominant characteristic of type 2 diabetes and obesity. In addition to insulin resistance, dysfunction and dysregulation of adipose tissue lead to increased circulating fatty acids and carbohydrates and cause hepatic lipid accumulation, hepatocyte injury, increased inflammation, and, in the long run, liver fibrosis [75–77]. Glucagon-like peptide 1 (GLP-1) is an intestinal hormone that has a role in the regulation of glucose metabolism. It stimulates insulin secretion, proinsulin gene expression, and β-cell proliferative and anti-apoptotic pathways, and it inhibits glucagon release, gastric emptying, and food intake [78]. GLP-1 receptor agonists are a class of drugs that can help break the insulin resistance pattern of NAFLD. Liraglutide (one of the FDA-approved GLP-1 receptor agonists) has been shown to improve liver function levels and reduce liver fat progression [79]. Histological evidence also shows that it is beneficial for NASH resolution in liver tissue [80]. Another GLP-1 receptor agonist, semaglutide, is approved for the treatment of type 2 diabetes [81] and is being studied for use in weight management [82]. Semaglutide has a mechanism of action similar to that of liraglutide, but its metabolic effects are more robust [83–85]. Besides its effect on control of diabetes and weight loss, it reduces AST levels and inflammation markers in the liver [86]. In a recent randomized, placebo-controlled, phase 2 trial, semaglutide increased NASH resolution compared to placebo in patients with biopsy-confirmed NASH and liver fibrosis. However, the trial did not show improvement in liver fibrosis [87].

#### **6. Conclusions**

NAFLD is an ever-growing problem affecting millions of people each day, but there is currently no recommended treatment to prevent or cure this formidable disease. The increasing number of liver transplants due to NAFLD has led to a huge need for suitable donor livers. In addition, many potential grafts are discarded because they are too steatotic for successful transplantation. Much research has focused on defatting strategies to treat living individuals with NAFLD and on techniques to generate suitable donor grafts for transplantation. Medical and surgical interventions, targeting specific proteins or identification of novel target genes for defatting purposes, defatting using pharmacologic agents, and machine perfusion of extracted livers are all exciting avenues to explore for future treatment of this disease (Figure 2). However, issues are likely to arise with any method due to the extremely interconnected physiology of the liver, where one change could have many effects far from the intended result. Therefore, all possibilities must be taken into careful consideration when developing treatments for NAFLD.

**Figure 2.** Overview of the different approaches for defatting purposes. In the third panel, red arrows show the beneficial effect of downregulation of target genes, while green arrows show the positive effect of upregulation on defatting. The fourth panel summarizes the structures of known chemicals from in vitro defatting experiments. Perfusion of the liver on normo- and hypothermic machines holds great promise for future therapeutic approaches (Created with BioRender.com).

**Author Contributions:** Conceptualization, E.N.Y. and C.K. (Cem Kuscu); methodology, E.N.Y., M.D. and C.K. (Cem Kuscu); investigation, E.N.Y., M.D.; resources, J.D.E., C.K. (Cem Kuscu); data curation, E.N.Y.; writing—original draft preparation, E.N.Y., C.K. (Cem Kuscu); writing—review and editing; E.N.Y., M.D., C.W., A.B., J.D.E., C.K. (Canan Kuscu), C.K. (Cem Kuscu); visualization, E.N.Y., M.D., and C.K. (Cem Kuscu); supervision, A.B., J.D.E., C.K. (Canan Kuscu), C.K. (Cem Kuscu); project administration, A.B., J.D.E., C.K. (Canan Kuscu), C.K. (Cem Kuscu); funding acquisition, A.B., J.D.E., C.K. (Canan Kuscu) and C.K. (Cem Kuscu). All authors have read and agreed to the published version of the manuscript.

**Funding:** Individuals were supported by the National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health (NIH) under award number R01DK117183 and DK132230 (AB), institutional startup by the James D. Eason Transplant Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Convolutional Neural Network Model Based on 2D Fingerprint for Bioactivity Prediction**

**Hamza Hentabli <sup>1,2,\*</sup>, Billel Bengherbia <sup>1</sup>, Faisal Saeed <sup>2,3,\*</sup>, Naomie Salim <sup>2</sup>, Ibtehal Nafea <sup>4</sup>, Abdelmoughni Toubal <sup>1</sup> and Maged Nasser <sup>5</sup>**


**Abstract:** Determining and modeling the possible behaviour and actions of molecules requires investigating the basic structural features and physicochemical properties that determine their behaviour during chemical, physical, biological, and environmental processes. Computational approaches such as machine learning methods are alternatives for predicting the physicochemical properties of molecules based on their structures. However, the limited accuracy and high error rates of such predictions restrict their use. In this paper, a novel technique based on a deep learning convolutional neural network (CNN) for the prediction of chemical compounds' bioactivity is proposed and developed. The molecules are represented in the new matrix format Mol2mat, a molecular matrix representation adapted from the well-known 2D-fingerprint descriptors. To evaluate the performance of the proposed methods, a series of experiments were conducted using two standard datasets, namely the MDL Drug Data Report (MDDR) and Sutherland datasets, comprising 10 homogeneous and 14 heterogeneous activity classes. After analysing the eight fingerprints, all possible combinations of the five best descriptors were investigated. The results showed that a combination of three fingerprints, ECFP4, EPFP4, and ECFC4, along with a CNN activity prediction process, achieved the highest performance of 98% AUC when compared to the state-of-the-art ML algorithms NaiveB, LSVM, and RBFN.

**Keywords:** activity prediction model; biological activities; bioactive molecules; convolutional neural network; deep learning

#### **1. Introduction**

Extraction of the structural activity relationship (SAR) [1,2] information from chemical datasets relies on the pairwise structural comparison of all toxicophore features and small molecules, which highlights the degree of the structural relationship between the compounds [3–6]. The Quantitative Structure–Activity Relationship (QSAR) can correlate the compound's chemical and structural features with its physicochemical or biological properties. The molecular descriptors are applied for encoding the features, while the QSAR model identifies the mathematical relationship between the descriptors and the biological features or other relevant properties of the known ligands for predicting the unknown ligands. These QSAR studies are able to reduce the failure costs of potential drug molecules, as they easily identify the promising lead molecules and reduce the number of expensive experiments. These are considered important tools in the pharmaceutical industry since they have identified many high-quality leads during the early stages of drug discovery.

**Citation:** Hentabli, H.; Bengherbia, B.; Saeed, F.; Salim, N.; Nafea, I.; Toubal, A.; Nasser, M. Convolutional Neural Network Model Based on 2D Fingerprint for Bioactivity Prediction. *Int. J. Mol. Sci.* **2022**, *23*, 13230. https://doi.org/10.3390/ijms232113230

Academic Editor: Alexandre G. de Brevern

Received: 14 September 2022 Accepted: 27 October 2022 Published: 30 October 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

A great deal of information is contained in the molecular structure of a compound: for example, it indicates the number of elements or describes its shape and electrostatic field [7,8]. The collection of atoms that constitute a molecule can be symbolically represented in many ways. It is not easy to determine a single optimum representation of the molecular structure that suits all applications [9–11].

Generally, molecules are represented using their molecular or structural formulae and line drawings. The molecular formula indicates the number of atoms of each element present in a single molecule of a compound; for example, H<sub>2</sub>O indicates the presence of two hydrogen atoms and one oxygen atom in a water molecule. In many cases, the molecular formula alone cannot represent the chemical structure. For instance, isomers are molecules with the same molecular formula but a different atomic arrangement. The structural formula depicts the molecular structure and represents the individual bonds between all atoms as lines.

Many chemoinformatics methods are based on numerical descriptors that include a description of the molecular structure and properties. These descriptors are used as input data for various statistical and data mining techniques. The other types of property descriptors are generally used in the diversity analysis, selection of the representative compound subsets, combinatorial library design, and QSAR studies. Thus, the fingerprint X of molecule A is represented using a sequence of numbers:

$$X_A = \{x_1, x_2, x_3, \dots, x_n\}$$

where *x*<sub>i</sub> refers to the i-th structural unit in molecule A, i.e., bonds, atoms, or fragments. The value n represents the length or size of the fingerprint, i.e., the number of molecular properties.

The 2D fingerprint descriptors are also used to provide a rapid screening step during substructure and similarity searches [1,10]. These 2D fingerprints are categorised based on the methods used, for example, the fragment dictionary and hashed methods illustrated in Figure 1. The fingerprints are generated using a fingerprinting process that converts a chemical structure into a binary form (i.e., a string of 0s and 1s). The binary form depicts the chemical shorthand, which indicates the presence/absence of the structural features in a molecule.

**Figure 1.** Two examples showing the generation of a molecular fingerprint: (**a**) Dictionary-based fingerprint and (**b**) hashed-based fingerprint.

The molecule-based fingerprints are represented by dividing the molecules into fragments of specific substructures and structural features. In this kind of representation, the fingerprint length is based on the number of fragments present in the dictionary, where every bit position in the binary string is assigned to one particular sub-structural feature in the dictionary. Thus, the bits can individually or in combination represent the presence or absence of the features [10,12].
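The two fingerprint families described above can be illustrated with a minimal sketch. The fragment dictionary, the fragment strings, the 16-bit length, and the use of SHA-256 as the hash are all assumptions chosen for illustration; they are not the descriptors or software used in this study.

```python
import hashlib

# Hypothetical fragment dictionary: each bit position is tied to one substructure.
FRAGMENT_DICTIONARY = ["C=O", "O-H", "N-H", "c1ccccc1", "C-Cl"]


def dictionary_fingerprint(fragments):
    """One bit per dictionary entry: 1 if the fragment occurs in the molecule."""
    present = set(fragments)
    return [1 if frag in present else 0 for frag in FRAGMENT_DICTIONARY]


def hashed_fingerprint(fragments, n_bits=16):
    """Hash every fragment to a bit position; different fragments may collide."""
    bits = [0] * n_bits
    for frag in fragments:
        digest = hashlib.sha256(frag.encode("utf-8")).hexdigest()
        bits[int(digest, 16) % n_bits] = 1
    return bits


# Toy molecule described only by the fragments it contains (illustrative).
toy_fragments = ["C=O", "O-H", "C-C"]
dict_fp = dictionary_fingerprint(toy_fragments)
hash_fp = hashed_fingerprint(toy_fragments)
print(dict_fp)  # "C-C" is not in the dictionary, so it sets no bit
print(hash_fp)
```

In the dictionary variant every bit has a fixed meaning, whereas in the hashed variant a bit can be set by more than one fragment, which trades interpretability for coverage of fragments that no dictionary lists.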

The state-of-the-art 2D fingerprint technique used in the present study was based on QSAR, which can predict and measure the biological activities of compounds. In this study, eight different 2D fingerprints, generated using the PaDEL descriptor software, were investigated for bioactivity prediction. Here, the 2D fingerprint descriptors were used with the CNN model for predicting the biological activities and for studying the combination and integration of various fingerprints in the CNN architecture. The next sections describe the background and design of the novel technique. The performance of the proposed technique was evaluated through several experiments on structure-based bioactivity prediction.

#### **2. Results**

The proposed code was implemented in public DL software, Keras [13], based on Theano [14]. The experiments were conducted using the Dell Precision T1700 CPU system with 16 GB memory and the professional-grade NVIDIA GeForce GTX 1060 6 GB graphics.

The proposed novel CNN model for predicting the molecular bioactivities was a ligand-based activity prediction or target-fishing technique that could be used for unknown chemical compounds. It was a deep learning system consisting of an adapted molecular matrix representation, "Mol2mat", which incorporated all the substructural data on the molecules based on their fingerprint features for predicting their activities. This proposed CNN method was then compared to three different ML algorithms available in the WEKA-Workbench, NaiveB, LSVM, and RBFN, using optimal parameters obtained from previous work using the same datasets [15], as explained in Section 4.4.

We also determined the computing prediction accuracy of this deep learning system by applying the technique described in Section 4.2, using eight fingerprint representatives. The results derived from these fingerprints were then compared using the Analysis of variance (ANOVA) technique as a significance test and a violin-plot with boxplot charts. The five fingerprint representatives that showed the best CNN configuration were further chosen as the best representatives. This encompassed Stage 1 of the analysis and is described in detail below. In Stage 2, these five representatives were assessed using all probable combinations, such as 2, 3, 4, or 5. The results acquired from Stage 2 were further compared using their violin-plot charts, and the best fingerprint combination was noted. Stage 2 is described in more detail below. In Stage 3, all results were compared for the best combination derived from the previous stages with three known ML algorithms, NaiveB, LSVM, and RBFN. The proposed CNN model in this paper will be henceforth referred to as CNNfp.
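As a hedged illustration of the ANOVA significance test mentioned above, the sketch below computes the one-way F statistic for several groups of accuracy values. The numbers are toy values, not results from this study, and the function name is hypothetical. The *p*-value follows from the F distribution with (k − 1, N − k) degrees of freedom; a statistics library such as SciPy (`scipy.stats.f_oneway`) can return it directly.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Toy per-class accuracies for three hypothetical fingerprints.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value = one_way_anova_f(toy_groups)
print(f_value)
```

A large F value means the differences between group means are large relative to the spread within each group, which is what a small *p*-value reports.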

#### *2.1. Benchmarking*

The proposed technique was evaluated by comparing it with three other machine learning methods using WEKA-Workbench [16] methods, including a Naive Bayesian classifier (NaiveB) [17], LibSVM [18], and a Radial basis function network (RBFN) [19]. Finding the best values for the classifier's parameters is a difficult task. However, the best probable setup for the LSVM classifier was identified by the WEKA-Workbench. In this paper, the linear kernel was used for SVM, and the values of 0.1, 1.0, and 0.001 were used for the Gamma, Cost, and Epsilon parameters, respectively. For the NaiveB classifier, a supervised discretisation technique was used to convert the numeric attributes to the nominal attributes, while the minimal standard deviation limit for the RBFN classifier was 0.01. All the remaining parameters of the classifiers used the default values in the WEKA-Workbench.

#### *2.2. Stage 1*

In this stage, the prediction accuracies of the 24 activity classes present in the MDDR1, MDDR2, and Sutherland datasets were determined and compared using eight fingerprint representatives. Figure 2 summarises the CNN configuration, which used the Mol2mat molecular representation.

**Figure 2.** A summary of the proposed CNN configuration that uses the Mol2Mat representation.

In Stage 1, the eight fingerprints described above were studied based on two parameters. The first parameter included the accuracy response vs. the number of iterations, while the second parameter included the MSE response vs. the number of epochs. These were studied in a 2D graph consisting of the training data results.

Figure 3a plots the accuracy against the number of iterations, with one line for each of the eight fingerprints. The ECFC4 fingerprint showed a rapid increase in prediction accuracy from the third epoch, whereas the EPFP4 fingerprint showed better accuracy after 17 epochs. However, the AlogP and the MDL fingerprints displayed the lowest prediction accuracy values. The mean squared error (MSE), or loss value, showed similar results to the accuracy performance, as shown in Figure 3b. The novel CNN model could accurately predict biological activities with an average MSE value of 0.0054 for the ECFC4 and 0.002 for the ECFP4 fingerprints.

Figure 4 shows the comparison of the prediction accuracy values for Stage 1 experiments that were conducted using the CNN model for eight fingerprint representatives using the violin-plot charts. The construction of violin-plot charts is shown on the right-hand side of this figure.

**Figure 3.** Evaluation of eight fingerprints based on their (**a**) accuracy and (**b**) MSE performance.

**Figure 4.** Prediction accuracy values of the CNN model for the eight fingerprint representatives using the violin-plot charts.

The violin-plot charts are able to remove the conventional boxplot elements and plot each activity class as a single point. Figure 4 indicates that the eight fingerprint representatives showed a clear difference in their average prediction accuracy values. ECFC4 showed the best average accuracy of 90.17. The graph fingerprint came next with a value of 74.84, closely followed by the CDKFp and ECFP4 fingerprints, which showed similar average accuracy values of 72.28 and 71.97, respectively. The worst average accuracy values were displayed by PubChem (53.88), MDL (26.25), and AlogP (22.45). The ANOVA significance test gave a small *p*-value of 0.04, which highlighted the difference between the fingerprints.

Furthermore, the AlogP, MDL, and PubChem fingerprints were regarded as the worst contenders as they showed a higher variance across the biological activity classes. Thus, CDK, ECFP4, ECFC4, EPFP4, and graph were taken as the five best fingerprints and forwarded to Stage 2 to improve the results using the possible combinations of two, three, four, or five of them. The combinations were based on fusion at the level of the extracted features.

In this stage, we drew on deep learning techniques for combining various sources of knowledge [20–22]. First, a feature extraction step was applied to each selected molecular fingerprint. The features extracted from the different fingerprints were then combined after a flatten layer, which followed the max-pooling layer and converted the 2D matrix data into a vector. This combination significantly improved the models, since they could benefit from every molecular fingerprint. The merged output could then be processed by the fully connected layers, known as dense layers. This section has described the CNN architecture utilised in this study and how several CNN branches can be combined into a single model. The next section describes the performance evaluation.
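The flatten-then-concatenate step can be sketched in plain Python. The 2 × 2 feature maps and the function names below are hypothetical stand-ins for the pooled outputs of the fingerprint branches; a deep learning framework would perform the same operation on tensors.

```python
def flatten(matrix):
    """Turn a 2D feature map (list of rows) into a 1D vector, row by row."""
    return [value for row in matrix for value in row]


def fuse_features(feature_maps):
    """Feature-level fusion: flatten each branch output, then concatenate."""
    fused = []
    for fmap in feature_maps:
        fused.extend(flatten(fmap))
    return fused


# Hypothetical pooled outputs from three fingerprint branches (2 x 2 each).
branch_a = [[0.1, 0.4], [0.0, 0.9]]
branch_b = [[0.7, 0.2], [0.3, 0.3]]
branch_c = [[0.5, 0.0], [0.6, 0.8]]

fused = fuse_features([branch_a, branch_b, branch_c])
print(len(fused))  # 3 branches x 4 features each
```

Because the fusion happens on extracted features rather than on raw fingerprints, each branch can learn filters suited to its own descriptor before the dense layers see the merged vector.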

#### *2.3. Stage 2*

In this stage, the prediction accuracies for the different combination cases of the five fingerprint representatives were determined. Table 1 presents the 26 possible combinations of two, three, four, and five of the CDK, ECFP4, ECFC4, EPFP4, and graph fingerprints. Henceforth, each combination case is referred to by its letter (A–Z), and each row represents one combination case. Case A consists of two fingerprints, while Case Z combines all five.



The colors are used to differentiate between each level: combinations of 2, blue; combinations of 3, orange; combinations of 4, yellow; combination of 5, green.
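The count of 26 cases follows from choosing two, three, four, or five of the five fingerprints (10 + 10 + 5 + 1). The short sketch below enumerates them; the enumeration order is whatever `itertools.combinations` produces and is not claimed to match the A–Z lettering of Table 1.

```python
from itertools import combinations

FINGERPRINTS = ["CDK", "ECFP4", "ECFC4", "EPFP4", "graph"]


def combination_cases(items, min_size=2):
    """List every subset of the items that has at least min_size members."""
    cases = []
    for size in range(min_size, len(items) + 1):
        cases.extend(combinations(items, size))
    return cases


cases = combination_cases(FINGERPRINTS)
counts = {size: sum(1 for c in cases if len(c) == size) for size in range(2, 6)}
print(counts)      # {2: 10, 3: 10, 4: 5, 5: 1}
print(len(cases))  # 26
```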

The 26 combinations of the five fingerprints were investigated, as shown in Table 1. Figures 5 and 6 summarise, as an example, the CNN configuration for the combination of the CDK, ECFP4, and EPFP4 fingerprints, referred to as "K", using the Mol2mat molecular representation. As seen in both figures, the model has three branches, each taking a 32 × 32 matrix as input and applying two convolutional layers followed by max-pooling; a concatenate layer then merges all extracted features into one array. Finally, there are two hidden layers with 256 and 128 neurons and an output layer with 10 outputs. Rectified linear activation functions are used in each hidden layer, and a SoftMax activation function is used in the output layer.
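To make the layer sizes concrete, the following sketch traces tensor shapes through one such three-branch model and counts the dense-layer parameters. Only the 32 × 32 input, the two convolutions plus max-pooling per branch, and the 256/128/10 dense sizes come from the text; the 3 × 3 kernels, the absence of padding, the 2 × 2 pooling window, and the 64 filters in the second convolution are assumptions made for illustration.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Output width/height of a square convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, pool=2):
    """Output width/height of non-overlapping max-pooling."""
    return size // pool


def branch_features(input_size=32, last_filters=64):
    """Flattened feature count of one branch: conv -> conv -> max-pool."""
    size = conv_out(conv_out(input_size))  # assumed 3 x 3 kernels, no padding
    size = pool_out(size)                  # assumed 2 x 2 pooling
    return size * size * last_filters      # assumed 64 filters in the last conv


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


n_branches = 3
merged = n_branches * branch_features()
layer_sizes = [merged, 256, 128, 10]
params = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
print(merged)
print(params)
```

Under these assumptions the first dense layer dominates the parameter count, so the size of the concatenated vector is the main lever on model size when more fingerprint branches are added.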


**Figure 5.** A summary of the CNN configuration for a combination case named "K" using a Mol2mat representation.

**Figure 6.** A CNN Model configuration for a combination case named "K" using the Mol2mat representation.

Figure 7 compares the prediction accuracy values for the Stage 2 experiments for all 26 combination cases with the help of violin-plot charts.

**Figure 7.** Prediction accuracy values of the CNN model applied to the 26 combination cases of the five best fingerprints, shown as violin-plot charts.

The results in Figure 7 show a *p*-value of 0.031 based on the ANOVA significance test results, indicating that the difference between the combination cases is significant. The violin-plot charts plotted each activity class as a point. It was seen that the D, O, R, and T combination cases displayed the highest prediction accuracy, >80%, and a low variance amongst all the activity classes. The combination cases were plotted in different boxplot charts to determine the distribution of the activity classes based on the low- and high-diversity values noted for each activity class. Figure 8 compares the prediction accuracies for all experiments in Stage 2 for the D, O, R, and T combination cases, which were plotted using the boxplot charts.

**Figure 8.** A comparison of the prediction accuracies for the D, O, R, and T combination cases, plotted using the boxplot charts.

Based on the violin-plot charts presented in Figure 7 and the Boxplot chart shown in Figure 8, a *p*-value of 0.048 was calculated based on the ANOVA significance test results. This indicated the significance of the difference between all the models. The R combination displayed the best average prediction accuracy of 99.17, indicating that a combination of the three fingerprints, ECFP4, EPFP4, and ECFC4, showed good performance compared to the other combinations.

The R combination also showed a lower variance of 5.52 compared to the other cases. Furthermore, this combination showed higher stability even when placed in a high- or low-diversity class. Meanwhile, the D, O, and T combinations displayed mean prediction accuracies of 97.45, 97.03, and 97.72, respectively. They also displayed higher variance than the R combination, with prediction accuracy variances of 12.62, 17.97, and 10.81, respectively, indicating that R was the best fingerprint combination seen in Stage 2.

#### *2.4. Stage 3*

In Stage 3, the authors compared the results for the best combination of ECFP4, EPFP4, and ECFC4, as established in Stage 2, with those obtained from the standard ML algorithms available in the WEKA-Workbench: NaiveB, LSVM, and RBFN.

Tables 2–4 show the sensitivity, specificity, and AUC values for all the datasets used here. A visual inspection of all tables could be used to compare the performance of the prediction accuracies of all four algorithms. However, the authors applied a quantitative boxplot chart to compare these algorithms. This process quantifies the agreement level between all the multiple sets and ranks the different objects.
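For reference, the three metrics reported in these tables can be computed as in the sketch below for a single activity class treated as a binary (active versus inactive) problem. The labels, scores, and the 0.5 threshold are toy values, and the one-class-at-a-time framing is an assumption; the averaging scheme used across activity classes in this study is not reproduced here.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Probability that a random active outscores a random inactive (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = active in the class of interest, 0 = inactive.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(sens, spec, area)  # 0.75 0.75 0.9375
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarises ranking quality over all thresholds, which is why the three values are reported together.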


**Table 2.** Sensitivity, specificity, and AUC values for all the prediction models using an MDDR1 dataset.

**Table 3.** Sensitivity, specificity, and AUC values for the prediction models using an MDDR2 dataset.


**Table 4.** Sensitivity, specificity, and AUC values for the prediction models using a Sutherland dataset.


Boxplot charts were used to assess the performance of a set of fingerprints, ECFP4, EPFP4, and ECFC4, using three algorithms (RBFN, NaiveB, and LSVM).

Here, the MDDR1, MDDR2, and Sutherland datasets, with their activity classes described in Tables 5–7, were regarded as judges. In contrast, parameters such as sensitivity, specificity, and AUC, measured using the different prediction algorithms, were regarded as objects. The outputs of this test included the *p*-value, median, and variance. Figure 9 shows the results of the boxplot chart, where the sensitivity values of the four algorithms were compared. The results show a *p*-value of 0.008 based on the ANOVA significance test results, which revealed a significant difference between the algorithms. The CNNfp algorithm showed a high sensitivity of 0.985, while the NaiveB and LSVM ML algorithms showed a high variance of 0.15 and 0.23, respectively, compared to the CNNfp. Diversity in all sensitivity values was especially seen in the algorithms that displayed a variance of 10<sup>−4</sup>. Furthermore, the NaiveB and LSVM models showed a mean sensitivity of 0.90 and 0.74, respectively.


**Table 5.** MDDR activity classes for DS1 dataset.

**Table 6.** MDDR activity classes for DS2 dataset.


**Table 7.** Sutherland activity classes.


**Figure 9.** Boxplot chart results based on comparing the sensitivity values of different algorithms: CNNfp, NaiveB, RBFN, and LSVM.

Figure 10 shows the boxplot chart results after comparing the specificity values of the CNNfp, NaiveB, RBFN, and LSVM algorithms. The NaiveB and RBFN ML algorithms showed a higher variance of 0.01 and 0.04, respectively, compared to the CNNfp. This diversity in all specificity values was especially seen in the algorithms that displayed a variance of 2.5 × 10<sup>−5</sup>. Furthermore, the CNNfp algorithm showed a high specificity value of 1.0, whereas the NaiveB and the RBFN algorithms displayed average specificity values of 0.99 and 0.98, respectively. The results showed a small *p*-value of 3.5 × 10<sup>−5</sup>, highlighting a significant difference between all algorithms.

**Figure 10.** Boxplot chart results based on comparing the specificity values of different algorithms: CNNfp, NaiveB, RBFN, and LSVM.

Figure 11 describes the boxplot chart results after comparing the AUC values of the CNNfp, NaiveB, RBFN, and LSVM algorithms. The LSVM, NaiveB, and RBFN ML algorithms showed a higher variance of 0.125, 0.083, and 0.033, respectively, compared to CNNfp. This diversity in all AUC values was especially seen in the algorithms that displayed a variance of 4.13 × 10<sup>−5</sup>. A combination of the Mol2mat with the CNNfp algorithm showed an AUC value of 0.99, whereas the LSVM, NaiveB, and RBFN algorithms displayed average AUC values of 0.96, 0.99, and 0.85, respectively. The results showed a *p*-value of 4.2 × 10<sup>−3</sup>, highlighting a significant difference between all algorithms.

**Figure 11.** Boxplot chart results based on the comparison of the AUC values of different algorithms: CNNfp, NaiveB, RBFN, and LSVM.

The boxplot chart results (Figures 9–11) showed that the use of CNNfp was very efficient and convenient and presented less severe outliers in comparison to the NaiveB, RBFN, and LSVM algorithms, thereby indicating the effectiveness of this prediction approach. The results presented in Tables 2–4 for all three datasets show that the combination of ECFP4, EPFP4, and ECFC4 fingerprints with a CNN activity prediction method resulted in the lowest variance for the sensitivity, specificity, and AUC values for all activity classes compared to the traditional NaiveB, RBFN, and LSVM algorithms. These results suggest that a deep learning technique could be a promising, novel, and effective method of predicting the activities of a range of chemical compounds.

#### **3. Discussion**

#### *3.1. Similarity Searching*

Comparing unknown molecules to known chemical compounds allows us to predict the activities of the unknown (target) compounds, on the assumption that a target compound will exhibit the activities of similar compounds. Several successful target prediction techniques have been proposed in the literature [11,23,24]. For example, the authors in [25] implemented a method for activity prediction using the Multi-level Neighbourhoods of Atoms (MNA) structural descriptor. This descriptor is generated based on the connection table and the table of atoms that represent each compound. A specific integer number is given to each descriptor according to its dictionary. The Tanimoto coefficient was used to calculate the molecular similarity. The target compound activities were then predicted based on the activities of the most similar known compound.
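The similarity-based prediction described above can be sketched as follows, using the Tanimoto coefficient on binary fingerprints, T = c / (a + b − c), where a and b are the numbers of bits set in each fingerprint and c is the number of bits set in both. The 8-bit fingerprints and the class labels are invented for illustration and do not correspond to the MNA descriptors of [25].

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two equal-length binary fingerprints."""
    a = sum(fp_a)
    b = sum(fp_b)
    c = sum(1 for x, y in zip(fp_a, fp_b) if x and y)
    denom = a + b - c
    return c / denom if denom else 0.0


def predict_activity(query, reference):
    """Assign the query the activity label of its most similar reference compound."""
    best_fp, best_label = max(reference, key=lambda item: tanimoto(query, item[0]))
    return best_label, tanimoto(query, best_fp)


# Hypothetical reference set: (fingerprint, activity label).
reference_set = [
    ([1, 1, 0, 0, 1, 0, 1, 0], "class A"),
    ([0, 0, 1, 1, 0, 1, 0, 0], "class B"),
]
query_fp = [1, 1, 0, 0, 1, 0, 0, 0]

label, similarity = predict_activity(query_fp, reference_set)
print(label, similarity)  # class A 0.75
```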

A number of machine learning techniques have been used for target activity prediction, including Binary Kernel Discrimination (BKD), Naive Bayesian Classifier (NBC), Artificial Neural Networks (ANN), and Support Vector Machines (SVM). The authors of [26] predicted five different ion channel targets using BKD and two different types of activity data. They found that the effectiveness of the model increased when highly similar activity classes were used. However, if this similarity was too low, the models would not work. As it is simple to build a network to include many sources of significant information about molecular structure, the authors of [27] used data fusion to aggregate the results of BIN searches using multiple reference structures. The authors in [28] presented a new classifier of kinase inhibitors using the NBC model. One noted advantage of this method is its ability to find compounds that are structurally unrelated to known actives, or novel targets for which there are inadequate data to develop a specific kinase model. In [29], the authors summarised how networks could conduct the equivalent of discriminant and regression analyses and underlined how initial overtraining and overfitting could lead to poor prediction performance. According to their predictions, the next revolution in QSAR will focus on developing better descriptors for connecting chemical structure to biological activity. The authors of [30] created a set of SVM classifiers that collectively account for 100 different forms of drug molecule action.

In their study, the multilabel-predicted chemical activity profiling was successfully accomplished by SVM classifiers, and they suggest that the proposed approach can forecast the biological activities of unidentified chemicals or signal negative consequences of drug candidates. In [11,31], the Bayesian belief network classifier was applied to predict the compound's target activities. The authors applied a novel technique to extend previous work, based on a convolutional neural network that uses the 2D fingerprint representation to predict the possibly bioactive molecules. The proposed CNN model for activity prediction also included the substructural information of the molecule.

#### *3.2. Convolutional Neural Network for Biological Activity Prediction*

In [32], the authors used Merck's drug discovery datasets and showed that deep neural networks (DNNs) could obtain better prospective predictions than the existing machine learning methods. In addition, the Multi-Task Deep Neural Network (MT-DNN) model [33] demonstrated good performance by training the neural network with a number of output neurons, where each neuron predicts the input molecule's activity in a different assay. Furthermore, [34–36] demonstrated how MT-DNN may be scaled to incorporate big databases such as PubChem Bioassays [37] and ChEMBL [38].

However, several issues and limitations still exist with the current methods. For instance, these methods work with targets that already have more available data and, thus, they cannot predict novel targets. Additionally, the current DL approaches rely on fingerprints, such as ECFP [39], which limit feature discovery to the composition of the particular chemical structures identified by the fingerprinting process [10,34,40]. This reduces their ability to discover arbitrary features. Moreover, the existing DL methods are blind to the target, as they are not able to elucidate the potential molecular interactions.

Another commonly used method is applying the similarity principle [41], which claims that substances with similar structures have similar biological characteristics. However, the authors in [42] discovered that it frequently fails because minor structural modifications can diminish a ligand's pharmacological activity, even when the molecules remain similar at the substructure level.

In order to address these issues and limitations, a novel Convolutional Neural Network (CNN)-based model using a 2D Fingerprint was proposed in this study for bioactivity prediction. This technique can be used for several applications such as bioactivity prediction, molecular searching, molecular classification, and virtual screening. The next section provides a description of how the suggested strategy was developed.

#### **4. Materials and Methods**

This section explains how this model is used for identifying and predicting the bioactivities of chemical compounds. First, we describe how various experimental benchmarks can be built and then utilised for system testing. Next, we discuss the systems for input representation and data encoding and deep convolutional network architecture.

#### *4.1. Data Sets*

The proposed prediction model was experimentally evaluated using multiple datasets. This study used three datasets (Tables 5–7), which were described earlier in [43,44] and used in several studies for validating the ligand-based virtual screening methods [7,11,24,31,45,46].

The datasets used are disparate, including a structurally homogeneous dataset, as shown in Figure 12, and a structurally diverse dataset, as shown in Figure 13 [3].

The original version of the MDDR database includes 707 distinct activity classes. The mean pairwise similarity (MPS) of each set of active molecules was computed for each activity class and used to estimate its diversity. The MPS for 102,000 compounds selected randomly from MDDR was 0.200. Figure 14 shows how the MPS divides the dataset into high- and low-diversity activity classes, with the cut-off point between the two groups equal to 0.200. This method is briefly explained and demonstrated in [3].

The MDDR1 and MDDR2 datasets comprise 10 homogeneous and heterogeneous activity classes each, while the Sutherland dataset comprises four activity classes. Tables 5–7 list the activity classes, the number of molecules in each class, and the class diversity. The diversity values in these tables were obtained using ECFP4 (extended connectivity) fingerprints to estimate the mean pairwise Tanimoto similarity across all pairs of molecules within each class.
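As an illustration of how an MPS value and the 0.200 cut-off could be applied, the following sketch computes the mean pairwise Tanimoto similarity of a class. The fingerprints are toy sets of on-bit indices rather than real ECFP4 output, the function names are hypothetical, and assigning a class that falls exactly on the cut-off to the low-diversity group is an assumption.

```python
from itertools import combinations

MPS_CUTOFF = 0.200  # cut-off between high- and low-diversity classes quoted in the text


def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    shared = len(a & b)
    union = len(a) + len(b) - shared
    return shared / union if union else 0.0


def mean_pairwise_similarity(fingerprints):
    """Average Tanimoto similarity over all unordered pairs of molecules in a class."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("at least two molecules are needed")
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


def diversity_label(mps, cutoff=MPS_CUTOFF):
    """A high mean similarity means a structurally homogeneous (low-diversity) class."""
    return "low-diversity" if mps >= cutoff else "high-diversity"


# Toy activity classes (invented bit sets, not real fingerprints).
homogeneous = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 4, 5}]
diverse = [{1, 2, 3, 4}, {5, 6, 7, 8}, {1, 9, 10, 11}]

mps_homogeneous = mean_pairwise_similarity(homogeneous)  # about 0.6
mps_diverse = mean_pairwise_similarity(diverse)          # about 0.05
```

The pairwise loop is quadratic in the class size, which is acceptable for classes of a few hundred or thousand molecules but would need sampling for very large sets.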

As noted above, the MPS values identify the diversity of activity classes that are used to evaluate the similarity search methods and biological activity prediction. Thus, the MPS values were used to compare the three used databases, as shown in Figure 15.

**Figure 12.** Examples of low-diversity molecules in the MDDR dataset.

**Figure 13.** Examples of high-diversity molecules in the MDDR dataset.

**Figure 14.** The average pairwise similarity (MPS) across each set of active molecules.

Box plots visually present the distribution of numerical data through its median and quartiles (or percentiles). They are commonly applied in descriptive statistics because they give an overview of how the data are distributed, along with their range. The right-hand side of Figure 15 depicts the construction of a box: the median MPS value is represented by the middle segment in the box, the first and third quartiles of the MPS values are shown as the lower and upper quartiles, and an empty circle represents an outlier.

**Figure 15.** Comparison of MPS values of the three databases using boxplot.
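The box plot elements described above can be computed as in the following sketch. The paper does not state how outliers were determined, so the common 1.5 × IQR rule is assumed here, and the MPS values are invented for illustration.

```python
from statistics import quantiles


def boxplot_summary(values, whisker=1.5):
    """Quartiles, median, and outliers of a sample (outliers by the assumed 1.5 x IQR rule)."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    outliers = [v for v in values if v < low or v > high]
    return {"q1": q1, "median": q2, "q3": q3, "outliers": outliers}


# Invented MPS values for the activity classes of one dataset.
mps_values = [0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.25, 0.70]
summary = boxplot_summary(mps_values)
# The value 0.70 lies above the upper fence and is reported as the only outlier.
```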

#### *4.2. Input Representation*

One of the major issues affecting chemoinformatics and QSAR applications is the need for good input features. The numerical properties of chemical compounds, which are generally stored in a graph-based format, can be calculated using a variety of techniques. Fingerprints are a specific type of complex descriptor that captures the feature distribution in a bit string representation [3]. However, a feature extraction step was necessary to analyse the data with the machine learning technique. This stage enhances the performance of all learning algorithms, as it helps to express the data in a machine-interpretable form. Even the best algorithms may perform poorly if the wrong features are used, while simple techniques can perform well if suitable features are applied. Feature extraction techniques can be unsupervised or manually conducted. Here, the authors present a new molecular representation, Mol2mat (molecule to matrix), which reshapes each molecular fingerprint into a 2D array suitable for use in deep learning architectures.

In this study, the authors investigated eight different 2D fingerprints that were generated using the SciTegic Pipeline Pilot software [47]. These included the 120-bit ALOGP, 1024-bit CDK (CDKFP), 1024-bit Path Fingerprints (EPFP4), 1024-bit ECFP4, 1024-bit ECFC4, 1024-bit Graph-Only Fingerprints (GOFP), 881-bit PubChem Fingerprints (PCFP), and the 166-bit Molecular Design Limited (MDL) fingerprints. Table 8 describes how the fingerprint representation of every molecule is stored in a 2D array in row-major order, and gives the size of the Mol2mat matrix representation for each fingerprint.


**Table 8.** Details of every matrix size for every fingerprint.

To show the difference between the 2D fingerprint representations used in this paper, the authors plotted the scatter graphs in Figure 16 using 5083 molecules (from the MDDR dataset) that are grouped into ten activity classes. These scatter plots were used to establish the relationships between the various compounds belonging to the same class. The molecules were represented by different individual 2D fingerprints and descriptors. To visualise their features, each representation was reduced to three dimensions using the Principal Component Analysis (PCA) method.

**Figure 16.** 3D−scatter plots based on seven fingerprints and representations of descriptors: (**a**) ALogP, (**b**) CDKFp, (**c**) ECFP4, (**d**) EPFP4, (**e**) GraphOnly, (**f**) MDL, and (**g**) PubchemFp of 5083 different molecules that were selected from the 10 biological activity classes of the MDDR dataset.

As shown in Figure 16, the molecules represented by the ECFP4 2D fingerprint are easily distinguished and do not overlap, so the molecules' biological activities can be segregated. This shows that the suggested 2D fingerprint representation may be successfully used for predicting the biological activity of various chemical substances.

After the generation of the eight fingerprints, the molecular fingerprints were stored in a 2D array using the row-major order, as shown in Algorithm 1.

**Algorithm 1.** A summary of the storage of the fingerprints in a 2D array to yield the Mol2mat presentation.


Algorithm 1 summarises the storage of the fingerprint in a 2D array using the row-major order in pseudo-code form. The algorithm's output was a 2D array of Mol2mat representations of the input molecule. Figure 17 summarises the design of the Mol2mat presentation process.

**Figure 17.** A summary of the newly proposed Mol2mat presentation process.
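A minimal sketch of the row-major storage step is given below. It is not the authors' Algorithm 1: the default matrix width (the smallest integer whose square is at least the fingerprint length) and the zero-padding of unused cells are assumptions, since the matrix sizes of Table 8 are not reproduced here, and the name `mol2mat` is used only for illustration.

```python
import math


def mol2mat(bits, n_cols=None, pad_value=0):
    """Store a flat fingerprint in a 2D list in row-major order.

    Assumptions: a near-square layout by default, and padding of the last
    row with `pad_value` when the length is not a multiple of `n_cols`.
    """
    bits = list(bits)
    if not bits:
        raise ValueError("empty fingerprint")
    if n_cols is None:
        n_cols = math.isqrt(len(bits) - 1) + 1  # ceil(sqrt(len(bits)))
    n_rows = -(-len(bits) // n_cols)            # ceil(len(bits) / n_cols)
    padded = bits + [pad_value] * (n_rows * n_cols - len(bits))
    # Row r holds bits r*n_cols .. (r+1)*n_cols - 1: this is row-major order.
    return [padded[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


toy_fingerprint = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
matrix = mol2mat(toy_fingerprint)
# -> [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 0, 0]]
```

Under these assumptions, a 1024-bit fingerprint maps to a 32 × 32 matrix without padding, while an 881-bit fingerprint would be padded to 30 × 30.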

After evaluating each fingerprint, the authors assessed all the probable combinations based on the five best descriptors. The combinations were based on the fusion of the extracted feature levels. The combination of multi-CNN can be performed as illustrated in [48,49]. Initially, the combination cases for 2, 3, 4, and 5 were generated by selecting two fingerprints, then three, followed by four, and finally, all five. Thereafter, the best combination was chosen.
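The enumeration of candidate combinations described above can be sketched as follows. The fingerprint labels are placeholders (this section does not name the five best descriptors), and `pick_best` with its caller-supplied scoring function is a hypothetical helper.

```python
from itertools import combinations

# Placeholder names for the five best-performing fingerprints.
best_five = ["FP1", "FP2", "FP3", "FP4", "FP5"]

# All combinations of size 2, then 3, then 4, and finally all five.
candidates = [
    combo
    for size in range(2, len(best_five) + 1)
    for combo in combinations(best_five, size)
]
# 10 pairs + 10 triples + 5 quadruples + 1 quintuple = 26 candidates


def pick_best(candidates, score):
    """Return the combination with the highest value of a caller-supplied score."""
    return max(candidates, key=score)
```

In practice the score would be a validation metric such as the AUC of the combined model trained on that set of fingerprints.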

#### *4.3. Convolutional Neural Network*

The default architecture was a convolutional architecture with fully connected layers. The authors used the Krizhevsky principles [50] for designing the CNN model configuration; the source code can be viewed in [51]. This configuration followed the earlier generic design [50]. Figure 18 presents the general CNN configuration, where the image was passed through a stack of convolutional (conv.) layers. The convolution step was followed by a max-pooling layer. It was observed that this combination improved the model accuracy and enhanced the CNN configuration.

**Figure 18.** The general CNN configuration.

The flatten layer came after the max-pooling layer. It transformed the 2D matrix data into a single vector, which could then be processed by the dense (fully connected) layers. The final layer was the Softmax classification layer [52,53].
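The sequence of layers described above (convolution, max-pooling, flattening, dense layer, Softmax) can be traced with a small pure-Python forward pass. This is a simplified sketch rather than the authors' network: it uses a single convolution with a fixed 3 × 3 kernel, a 2 × 2 pooling window, a ReLU activation, and hand-set dense weights, all of which are assumptions made for illustration.

```python
import math


def conv2d(image, kernel):
    """Valid 2D convolution (cross-correlation) followed by a ReLU activation."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            s = sum(image[r + i][c + j] * kernel[i][j]
                    for i in range(kh) for j in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out


def max_pool(fmap, size=2):
    """Non-overlapping max-pooling with a size x size window."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]


def flatten(fmap):
    """Turn a 2D feature map into a single vector (row-major)."""
    return [v for row in fmap for v in row]


def dense(vector, weights, biases):
    """Fully connected layer: one weight row and one bias per output node."""
    return [sum(w * x for w, x in zip(ws, vector)) + b
            for ws, b in zip(weights, biases)]


def softmax(logits):
    """Numerically stable Softmax over the output logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 6 x 6 "Mol2mat" input (checkerboard bits) and a 3 x 3 kernel of ones.
image = [[(r + c) % 2 for c in range(6)] for r in range(6)]
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

feature_map = conv2d(image, kernel)   # 4 x 4
pooled = max_pool(feature_map)        # 2 x 2
vector = flatten(pooled)              # length 4
logits = dense(vector, [[0.1] * 4, [-0.1] * 4], [0.0, 0.0])
probs = softmax(logits)               # two class probabilities summing to 1
```

A trained model would learn the kernel and dense weights by back-propagation; they are fixed here only so that the data flow through each layer can be followed by hand.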

Although CNN displayed good results for the feature learning and the prediction tasks, recent studies have shown a better performance by fusing different CNNs [20,21,54,55]. These combinations can be implemented using feature concatenation or by computing the average of the output prediction scores derived from various CNNs.
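The two fusion strategies mentioned above can be contrasted in a short sketch. The function names and the numerical values are hypothetical; each inner list stands for the output of one CNN branch.

```python
def fuse_features(branch_features):
    """Feature-level fusion: concatenate the flattened feature vectors of all branches."""
    return [v for features in branch_features for v in features]


def average_scores(branch_scores):
    """Score-level fusion: average the class probabilities predicted by each branch."""
    n = len(branch_scores)
    return [sum(column) / n for column in zip(*branch_scores)]


# Invented flattened features from three fingerprint-specific CNN branches.
features = [[0.2, 0.7], [0.1], [0.9, 0.4, 0.3]]
fused = fuse_features(features)  # one vector of length 6 for the dense layers

# Invented class probabilities (active, inactive) from three separate CNNs.
scores = [[0.9, 0.1], [0.7, 0.3], [0.8, 0.2]]
mean_scores = average_scores(scores)  # approximately [0.8, 0.2]
```

Feature concatenation lets the shared dense layers learn interactions between fingerprints, whereas score averaging keeps the networks independent and only combines their final outputs.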

Some studies [48,49] described the combination of 3 CNN models, as shown in Figure 19. These models were based on the fusion of the information level. Fusion could be performed early in the network after modifying the 1st-layer convolution filters for an extension of time, or it could be performed later, after placing 2 different single-frame networks and then fusing their outputs after the processing. The yellow, green, red, and blue boxes depict the fully connected, normalisation, convolution, and pooling layers, respectively. In a Slow Fusion model, the highlighted columns share the parameters.

In this stage, we used better techniques to combine the various sources of knowledge available in the area of deep learning [20–22]. Firstly, we proposed a feature extraction step for each selected molecular fingerprint. This combination significantly improved the models, since they could benefit from every molecular fingerprint and then combine all the features extracted from the various sources after a flatten layer, which followed the max-pooling layer. The flatten layer converted the 2D matrix data into a vector, so that the output could be processed by the fully connected layers, called the dense layers. This section has described the CNN architecture used in this research and how multiple CNNs can be combined in one model. In the next section, we will describe the performance evaluation.

**Figure 19.** Different approaches used for fusing the information present in the CNN layers.

#### *4.4. Network Architecture*

As mentioned above, eight fingerprint representations were generated using the SciTegic Pipeline Pilot software [47]. They were then stored in a 2D array in row-major order, using Algorithm 1, to derive the novel Mol2mat matrix representation.

As previously stated, a few fingerprints complemented one another, and their combination yielded good results. This indicated that different fingerprints generated differing results with regard to biological activity prediction or similarity searches. This further indicated that the different QSAR models could be developed based on different fingerprints with similar accuracy. Currently, researchers tend to combine and merge all fingerprints and descriptor sets, which comprise various types of fingerprints [3]. After evaluating each fingerprint, the authors assessed all the probable combinations based on the five best descriptors. The combinations were based on the fusion of the extracted feature levels.

In the present study, we used better techniques for combining the various sources of knowledge available in the area of deep learning [20–22]. Firstly, we proposed a feature extraction step for presenting every best molecular fingerprint in which all molecules were passed through 2 conv. layers, using a (3 × 3) feature map size for convolution and one max-pooling layer. This combination significantly improved the models since they could benefit from every molecular fingerprint and combine all the extracted features from various sources after a flattened layer. As a result, they could process the output data using the fully connected layers. The first two fully connected layers were built using a different number of nodes in every combination. Table 9 presents these node numbers in detail in every combination. The combination cases for 2, 3, 4, and 5 were generated by selecting two fingerprints, then three, followed by four, and finally, all five. The best combination was then chosen.


**Table 9.** Details of the first and second fully connected layers for every combination.

The final layer included the Softmax layer [50,52,53]. Figure 20 describes the configuration of the combined CNN, which was used to assess 3 fingerprints.

The target was as follows: to predict whether a specific chemical compound, *i*, showed activity for a target, *t*. These data could be encoded in binary form, *y<sub>it</sub>*, where *y<sub>it</sub>* = 1 for an active compound and *y<sub>it</sub>* = 0 for an inactive compound. The compound's behaviour was also predicted for all targets simultaneously. In the training stage, a general back-propagation algorithm was used to train the CNN by decreasing the cross-entropy between the targets and the activations of the output layer.

**Figure 20.** The configuration of the combined CNN that was used for 3 fingerprints.
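The binary encoding of compound and target activity and the cross-entropy objective described above can be sketched as follows. The text does not give the exact form of the loss, so this example assumes that each compound and target pair is scored independently with a probability between 0 and 1 and that the loss is averaged over all pairs; the labels and predictions are invented.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean cross-entropy over all (compound i, target t) pairs.

    y_true[i][t] is 1 for an active compound and 0 for an inactive one;
    y_pred[i][t] is the predicted probability of activity (an assumption).
    """
    total, count = 0.0, 0
    for row_true, row_pred in zip(y_true, y_pred):
        for y, p in zip(row_true, row_pred):
            p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count


# Two compounds, two targets (invented labels and predictions).
y_true = [[1, 0], [0, 1]]
confident = [[0.9, 0.1], [0.2, 0.8]]
uninformed = [[0.5, 0.5], [0.5, 0.5]]

loss_confident = binary_cross_entropy(y_true, confident)
loss_uninformed = binary_cross_entropy(y_true, uninformed)  # equals ln 2
```

Back-propagation adjusts the network weights so that this quantity decreases, which moves the predictions from the uninformed case towards the confident one.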

#### **5. Conclusions**

This study has investigated the use of molecular fingerprints in a Convolutional Neural Network model to predict the activities of ligand-based targets. The results indicate that the combination of the ECFP4, EPFP4, and ECFC4 fingerprints with a CNN activity prediction method produced the lowest variance for the sensitivity, specificity, and AUC values for all the activity classes, when compared to the three traditional ML algorithms of NaiveB, LSVM, and RBFN, available in the WEKA-Workbench. The paper described a novel Mol2mat process, which showed low overlap and was able to segregate all the biological activities of the chemical compounds. A combination of three fingerprints with CNN was used on some popular datasets, and the performance of this combination was compared to that of three traditional ML algorithms. The proposed algorithm achieved good prediction rates (where the low- and high-diversity datasets displayed a 98% AUC value). The results also showed that combining the ECFP4, EPFP4, and ECFC4 fingerprints with CNN improved the performance on both the heterogeneous and homogeneous datasets. In this study, the authors have shown that this combination of fingerprints with the CNN technique is a convenient and stable prediction process, which could be used for determining the activities of unknown chemical compounds. However, this field needs to be investigated further, and more accurate prediction processes must be developed for high-diversity activity compounds.

**Author Contributions:** Conceptualization, H.H. and N.S.; methodology, H.H., F.S. and N.S.; software, H.H.; validation, B.B., I.N., A.T. and M.N.; formal analysis, H.H., B.B., I.N., A.T. and M.N.; investigation, H.H., B.B., I.N., A.T. and M.N.; resources, H.H., F.S. and M.N.; data curation, H.H., B.B. and A.T.; writing—original draft preparation, H.H. and N.S.; writing—review and editing, H.H., F.S., N.S., I.N. and M.N.; visualization, H.H.; supervision, N.S. and F.S.; project administration, N.S. and F.S.; funding acquisition, F.S., N.S. and I.N. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Research Management Center at Universiti Teknologi Malaysia (Vot No: Q.J130000.21A6.00P48) and Ministry of Higher Education, Malaysia (JPT(BKPI) 1000/016/018/25(58)) through Malaysia Big Data Research Excellence Consortium (BiDaREC) (Vot No: R.J130000.7851.4L933), (Vot No: R.J130000.7851.4L942), (Vot No: R.J130000.7851.4L938), and (Vot No: R.J130000.7851.4L936). We are also grateful to (Project No: KHAS-KKP/2021/FTMK/C00003) and (Project No: KKP002-2021) for their financial support of this study.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The MDL Drug Data Report (MDDR) dataset is owned by www.accelrys.com, accessed on 15 January 2020. A license is required to access the data.

**Acknowledgments:** The authors would like to thank the Research Management Center at Universiti Teknologi Malaysia for funding this research using (Vot No: Q.J130000.21A6.00P48) and Ministry of Higher Education, Malaysia (JPT(BKPI)1000/016/018/25(58)) through Malaysia Big Data Research Excellence Consortium (BiDaREC) (Vot No: R.J130000.7851.4L933), (Vot No: R.J130000.7851.4L942), (Vot No: R.J130000.7851.4L938), and (Vot No: R.J130000.7851.4L936). We are also grateful to (Project No: KHAS-KKP/2021/FTMK/C00003) and (Project No: KKP002-2021) for their financial support of this study.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

