**Peptides for Health Benefits 2019**

Edited by

Blanca Hernández-Ledesma and Cristina Martínez-Villaluenga

Printed Edition of the Special Issue Published in *International Journal of Molecular Sciences*

www.mdpi.com/journal/ijms

## **Peptides for Health Benefits 2019 Volume 2**

Special Issue Editors **Blanca Hernández-Ledesma, Cristina Martínez-Villaluenga**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Special Issue Editors* Blanca Hernández-Ledesma, Institute of Food Science Research (CIAL, CSIC-UAM, CEI UAM+CSIC), Spain

Cristina Martínez-Villaluenga, Institute of Food Science, Technology and Nutrition (ICTAN, CSIC), Spain

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *International Journal of Molecular Sciences* (ISSN 1422-0067) (available at: https://www.mdpi.com/journal/ijms/special_issues/Peptides_2019).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Article Number*, Page Range.

**Volume 2 ISBN 978-3-03936-084-0 (Hbk) ISBN 978-3-03936-085-7 (PDF)** **Volume 1-2 ISBN 978-3-03936-082-6 (Hbk) ISBN 978-3-03936-083-3 (PDF)**

© 2020 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.


## **About the Special Issue Editors**

**Blanca Hernández-Ledesma**, Ph.D., earned a B.S. in Pharmacy in 1998 and defended her Ph.D. thesis in Pharmacy in 2002. Her research career has focused on the biological activity of food proteins/peptides, aiming to better understand their health implications and to develop novel food ingredients. She is the author of 76 JCR articles, 9 popular science articles, and 30 book chapters, with an h-index of 33 (WoS). Her results have been presented at 78 international and national conferences. She has supervised 4 doctoral theses and 14 Master's theses. She has participated in more than 30 international and national research projects. She has served as a member of the Selection Board for Tenured Scientists and of Ph.D. and Master's thesis dissertation committees, as a reviewer of international Ph.D. theses, and as a member of national and international project evaluation panels. She is a member of the Editorial Committees of 3 books and 8 journals, and collaborates as a reviewer for more than 90 journals.

**Cristina Martínez-Villaluenga**, Ph.D., received a B.S. in Biology from the Complutense University of Madrid in 2001 and a Ph.D. in Food Science from the Autonomous University of Madrid in 2006. She joined the Spanish Research Council (CSIC) in 2009. The long-term goal of Dr. Martínez's research program is to enhance the health of individuals by identifying and determining the benefits of the bioactive components of plant foods, with a special focus on bioactive peptides. Dr. Martínez's research on legumes, cereals, and pseudocereals has led to increased understanding of the anti-inflammatory, anti-hypertensive, anti-diabetic, and other physiological properties of these foods. She is the author of 94 JCR articles and 9 book chapters, with an h-index of 31 (WoS). Her results have been disseminated at 84 international and national conferences and on social media. In the last 10 years, she has supervised a total of 9 Ph.D. theses, 5 Master's theses, and more than 20 undergraduate students. She has participated in a total of 35 international and national R&D projects and contracts with the agri-food sector. She is a member of the Editorial Committees of 3 books and 3 journals.

## *Article* **In Silico and In Vitro Assessment of Portuguese Oyster (***Crassostrea angulata***) Proteins as Precursor of Bioactive Peptides**

#### **Honey Lyn R. Gomez <sup>1</sup>, Jose P. Peralta <sup>1</sup>, Lhumen A. Tejano <sup>1</sup> and Yu-Wei Chang <sup>2,</sup>\***


Received: 26 September 2019; Accepted: 17 October 2019; Published: 20 October 2019

**Abstract:** In this study, the potential bioactivities of Portuguese oyster (*Crassostrea angulata*) proteins were predicted through in silico analyses and confirmed by in vitro tests. *C. angulata* proteins were characterized by sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE) and identified by proteomics techniques. Hydrolysis simulation with the BIOPEP-UWM database revealed that pepsin (pH > 2) can theoretically release the greatest number of bioactive peptides from *C. angulata* proteins, predominantly angiotensin I-converting enzyme (ACE) and dipeptidyl peptidase IV (DPP-IV) inhibitory peptides, followed by stem bromelain and papain. Hydrolysates produced by pepsin, bromelain and papain showed ACE and DPP-IV inhibitory activities in vitro, with pepsin hydrolysate (PEH) having the strongest activities of 78.18% and 44.34% at 2 mg/mL, respectively. Bioactivity assays of PEH fractions showed that low molecular weight (MW) fractions possessed stronger inhibitory activity than the crude hydrolysate. Overall, the in vitro results corresponded with the in silico predictions. These findings suggest that in silico analysis is a rapid method to predict bioactive peptides in food proteins and to determine suitable enzymes for hydrolysis. Moreover, *C. angulata* proteins can be a potential source of peptides with pharmaceutical and nutraceutical applications.

**Keywords:** *Crassostrea angulata*; in silico; BIOPEP-UWM database; bioactive peptides; proteomics

#### **1. Introduction**

Oysters are the most popular and abundantly cultured shellfish in Taiwan [1]. They are considered one of the major species of marine bivalves, comprising 33% of total global production [2]. In most countries, these marine bivalves are consumed as food because of their health benefits, versatility, and ease of preparation. Oysters are also used as raw materials for canning and bottling and for the production of condiments such as oyster sauce and powders. Despite being highly nutritious, oysters have not gained much attention because of their characteristic flavor [3]. Issues associated with post-harvest processing and handling also lower their market value. Moreover, unprocessed oyster meat has a very short shelf life and is known to pose a risk to public health [4,5]. Thus, many researchers are searching for new post-harvest applications and developing high-value products from oysters.

Hypertension and type 2 diabetes mellitus (T2DM) are two of the most common chronic diseases, affecting millions of people [6]. Both are linked to the activity of specific enzymes. Blood pressure is regulated by the renin-angiotensin system. Renin cleaves angiotensinogen to produce angiotensin I, an inactive decapeptide. Angiotensin I-converting enzyme (ACE) then cleaves angiotensin I, converting it to the potent vasoconstrictor angiotensin II [7,8]. T2DM is characterized by hyperglycemia due to impaired insulin secretion, which can result from the degradation of incretin hormones. During meals, endocrine cells release incretin hormones such as glucagon-like peptide-1 (GLP-1) and glucose-dependent insulinotropic polypeptide (GIP) [9]. These hormones stimulate pancreatic β-cells to boost glucose-dependent insulin secretion and suppress glucagon secretion, resulting in normal blood glucose levels [10,11]. Dipeptidyl peptidase IV (DPP-IV) is a ubiquitously expressed enzyme mainly involved in modulating the biological activity of circulating peptide hormones by cleaving off *N*-terminal dipeptides of the form X-Pro and X-Ala. Consequently, it can degrade and inactivate incretin hormones with Ala as the second *N*-terminal residue, such as GLP-1 and GIP [12,13]. These mechanisms have been the basis for the formulation of therapeutic drugs targeting these enzymes. Over the years, synthetic ACE and DPP-IV inhibitors have been used in the management of these diseases [14–17]. However, these drugs are believed to cause adverse effects. Daily consumption of food containing ACE and DPP-IV inhibitory peptides is reported to help lower blood pressure and blood sugar to healthy levels without undesirable effects [18]. Thus, food proteins from natural sources are now being studied as alternative therapeutic agents.

Oyster is a rich source of proteins, which generally range from 37% to 81% on a dry weight basis [19–21]. In general, proteins contain peptides and essential amino acids which possess specific biological activity. Biologically active peptides are short sequences of amino acids that can be released from protein precursors through gastrointestinal digestion and food processing. They provide physiological effects in the body and function as regulatory compounds with hormone-like activity [22]. Peptides need to be released from the parent protein, be ingested, be bioaccessible, and reach the target site in sufficient quantities to exhibit biological activity. Recently, several studies have focused on the generation of bioactive peptides from food proteins and their utilization as functional ingredients [23]. Previous studies revealed that oyster is a good source of biologically active peptides with antioxidant, anti-cancer, ACE inhibitory, and anti-microbial activities in vitro and antihypertensive activity in vivo [24–27]. Given this therapeutic potential, oysters can be considered an alternative source of peptides for use as ingredients in functional foods and nutraceuticals.

One of the most common methods for producing bioactive peptides from food proteins is enzymatic hydrolysis. Traditionally, the selection of enzymes suitable for liberating potent peptides is based only on literature surveys and in vitro analyses [28]. However, this approach is costly and time-consuming. To overcome these drawbacks, in silico techniques have been proposed and utilized. They are useful for predicting the release of bioactive peptides from known protein sequences and for selecting suitable enzymes for hydrolysis [28,29]. Furthermore, the use of in silico techniques for screening and identification of novel bioactive peptides has been shown to be much more economical and time-saving [30].

Therefore, the objectives of this study were to assess the usefulness of in silico techniques in identifying the bioactive peptides encrypted in the *C. angulata* proteins and screening for the most suitable enzyme capable of releasing these peptides. Furthermore, it aimed to evaluate the bioactivities of *C. angulata* protein hydrolysate through in vitro analysis.

#### **2. Results and Discussion**

#### *2.1. Identified Proteins From C. angulata*

Freeze-dried Portuguese oyster (*C. angulata*) was subjected to SDS-PAGE to separate the proteins according to their molecular weights (MWs). Among all the bands observed in the SDS-PAGE gel, the eight most distinct bands were selected for further protein identification (Figure 1). These bands were subjected to in-gel digestion and nanoLC-nanoESI-MS/MS analysis and results obtained were then matched with the information from different protein databases through Mascot database search.

**Figure 1.** Protein patterns of Portuguese Oyster (*C. angulata*) by 12% sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE). M: Protein marker; FDOM: freeze-dried oyster (*C. angulata*) meat.

The identification of *C. angulata* proteins is based mainly on the occurrence of matched tryptic peptides from oyster (resulting from trypsin digestion) within the sequences of known proteins from the database. Based on the results of the Mascot MS/MS ion search, all the identified tryptic peptides from oyster were listed as doubly or triply charged peptides. Figure 2 illustrates how the doubly charged peptide IDSLEGSVSR (MW = 1061.53 Da) and the triply charged peptide LTQENFDLQHQVQELDAANAGLAK (MW = 2652.32 Da) from paramyosin (band B) were identified using the mass spectrum from nanoLC-nanoESI-MS/MS analysis. The doubly charged peptide has an observed signal *m*/*z* of 531.78 (Figure 2A). Inset (a) displays the 0.5 difference between adjacent signals, while inset (b) shows the fragmentation spectrum of the identified peptide. The triply charged peptide, with an observed signal *m*/*z* of 885.45, was distinguished by a 0.3 difference between adjacent signals, as shown in inset (a) (Figure 2B). The final result and the fragmentation of this peptide are illustrated in inset (b).

**Figure 2.** NanoLC-nanoESI-MS/MS spectra (*m*/*z* region 350 to 800 Da and 690 to 930 Da) of oyster protein band B with representative spectra of identified tryptic peptides in doubly (**A**) and triply (**B**) charged signal.
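The charge-state reasoning above can be checked with a short calculation. The sketch below is illustrative only and is not part of the authors' workflow: it assumes the reported MWs are neutral monoisotopic masses, that ions are protonated ([M + zH]<sup>z+</sup>), and that adjacent isotopic peaks are spaced by roughly 1/z. The function names are hypothetical.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz_from_mass(neutral_mass: float, charge: int) -> float:
    """Expected m/z of an [M + zH]^z+ ion for a given neutral mass."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def charge_from_spacing(spacing: float) -> int:
    """Infer the charge state from the spacing between isotopic peaks.

    Adjacent isotopic peaks differ by about 1 Da in mass, so they appear
    about 1/z apart on the m/z axis.
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    return round(1.0 / spacing)


# Masses taken from the text above.
doubly = mz_from_mass(1061.53, 2)   # IDSLEGSVSR
triply = mz_from_mass(2652.32, 3)   # LTQENFDLQHQVQELDAANAGLAK

print(f"z=2: m/z = {doubly:.2f}, inferred charge = {charge_from_spacing(0.5)}")
print(f"z=3: m/z = {triply:.2f}, inferred charge = {charge_from_spacing(0.3)}")
```

Under these assumptions the doubly charged ion comes out at about 531.77, close to the observed 531.78. The triply charged ion comes out at about 885.11; the reported 885.45 lies roughly one isotopic spacing (about 0.33) above that, which would be consistent with a heavier isotopic peak having been labeled, although the text does not say so.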

Out of the 16,598,945 protein sequences covered by the Mascot database search, 352 proteins under the genus *Crassostrea* were identified. In each band, one protein (the highest-scoring *Crassostrea* sp. protein) was selected for further screening and evaluation. From these, five proteins belonging to *Crassostrea gigas* were chosen based on their high protein scores and sequence coverage. These proteins are the myosin heavy chain from striated muscle isoform X1, paramyosin isoform X2, tropomyosin isoform X1, myosin regulatory light chain B from smooth adductor muscle isoform X2, and actin. The accession numbers, protein lengths, scores, sequence coverage, and MWs of the selected proteins are listed in Table 1.


**Table 1.** Identified proteins from Portuguese oyster (*C. angulata*) meat and their characteristics.

\* Selected proteins for in silico analysis (based on protein score and sequence coverage).

As observed, myosin heavy chain is the largest of the five selected proteins, with a theoretical MW of 220 kDa, while myosin regulatory light chain B is the smallest (18.63 kDa). Both proteins are found in the muscle of oysters and other mollusks. Myosin is a contractile protein which plays a major role in muscle contraction. It is composed of six subunits: two heavy chains and four light chains [31]. Paramyosin, on the other hand, is a protein unique to invertebrates such as oyster. It forms the core protein of the thick filaments of oysters; it generally ranges from 3% to 9% (*w*/*w*) and constitutes 38% to 48% (*w*/*w*) of the total myofibril [32]. Similarly, actin and tropomyosin are important proteins in oyster, specifically for muscle contraction. To validate the use of these proteins in the subsequent enzymatic hydrolysis simulation, BLAST analysis was performed to compare the level of homology between these *C. gigas* proteins and their *C. angulata* counterparts. BLAST analysis of the myosin essential light chain protein from both oyster species resulted in 157/157 (100%) identities, 157/157 (100%) positives and 0/157 (0%) gaps, indicating that the two *Crassostrea* species are identical in this protein sequence. Thus, the identified *C. gigas* proteins were used to represent the proteins of *C. angulata* in the subsequent in silico analysis.
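The identity and gap figures reported by BLAST can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned to equal length, with `-` marking gaps; it does not perform the alignment itself and does not compute "positives", which would require a substitution matrix. The function name and the example sequences are hypothetical.

```python
def alignment_stats(aligned_a: str, aligned_b: str) -> dict:
    """Count identities and gaps over a pairwise alignment.

    Both inputs must be the same length; '-' denotes a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")

    length = len(aligned_a)
    pairs = list(zip(aligned_a, aligned_b))
    identities = sum(1 for a, b in pairs if a == b and a != "-")
    gaps = sum(1 for a, b in pairs if a == "-" or b == "-")
    return {
        "length": length,
        "identities": identities,
        "gaps": gaps,
        "percent_identity": 100.0 * identities / length,
        "percent_gaps": 100.0 * gaps / length,
    }


# Hypothetical toy alignments (not the oyster sequences).
print(alignment_stats("MKTAYIAK", "MKTAYIAK"))  # identical sequences
print(alignment_stats("MKTA-IAK", "MKTAYLAK"))  # one gap, one mismatch
```

For the result reported above, the same arithmetic gives 157/157 = 100% identities and 0/157 = 0% gaps.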

#### *2.2. In Silico Prediction of Potential Bioactivities*

In silico analysis of oyster proteins with the BIOPEP-UWM database revealed that all five selected proteins are good precursors of biologically active peptides, predominantly with DPP-IV and ACE inhibitory activities, with a total of 2179 and 1391 peptides, respectively (Table 2). Most of the reported DPP-IV inhibitory peptides contain P (proline) and/or hydrophobic amino acids in their sequences. GP and PG sequences are the ones most frequently found in meat and fish [33]. On the other hand, peptides with hydrophobic amino acid residues (W, F, Y, or P) or a positively charged residue (R or K) at the *C*-terminal position, or with branched aliphatic side chains (V and I) at the *N*-terminal position, are known to possess strong ACE inhibitory activity, and most of these peptides contain 2–12 amino acid residues [28].
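The structural features described above can be turned into a simple screening heuristic. The sketch below is an illustration only, not the BIOPEP-UWM algorithm and not a predictor of measured activity: it flags a peptide of 2–12 residues if its *C*-terminal residue is W, F, Y, P, R, or K, or if its *N*-terminal residue is V or I. The function name and test peptides are hypothetical.

```python
C_TERMINAL_FAVOURED = set("WFYPRK")  # hydrophobic or positively charged
N_TERMINAL_FAVOURED = set("VI")      # branched aliphatic side chains


def matches_ace_motif(peptide: str) -> bool:
    """Return True if a peptide shows the terminal-residue features
    associated with ACE inhibition in the text (2-12 residues long)."""
    peptide = peptide.strip().upper()
    if not 2 <= len(peptide) <= 12:
        return False
    return (
        peptide[-1] in C_TERMINAL_FAVOURED
        or peptide[0] in N_TERMINAL_FAVOURED
    )


# Hypothetical peptides used purely to exercise the rule.
for candidate in ["VAP", "AE", "GGGGGGGGGGGGW"]:
    print(candidate, matches_ace_motif(candidate))
```

A rule of this kind is only a coarse filter. Peptides that fail it can still be active, and peptides that pass it may not be, so database matching and in vitro assays remain necessary.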


**Table 2.** Total number of potential bioactive peptides from oyster proteins predicted in silico using BIOPEP-UWM database (accessed on 6 March 2019).

\* Other activities include antiamnestic, antibacterial, antioxidative, antithrombotic, neuropeptide, renin inhibitor, immunomodulating, stimulating, regulating, alpha glucosidase inhibitor, activating ubiquitin-mediated proteolysis, etc.

Most of the bioactive peptides found in the protein sequences of *C. angulata* are di- and tripeptides (Table S1). The dipeptide AE is the most frequently occurring DPP-IV inhibitory peptide in *C. angulata* proteins, especially in myosin heavy chain, paramyosin, and tropomyosin. A (alanine) belongs to the hydrophobic group of amino acids and occurs in considerable amounts in oysters. E (glutamic acid) is also abundant in oyster species [19,34]. However, peptides containing the same amino acid residues in different positions can exhibit different biological properties. For example, the dipeptide AE, which is characterized by DPP-IV inhibitory activity, shows an inhibitory effect against ACE in its reversed form (EA). This indicates that the activity of a peptide can vary depending on both the type of amino acids forming the peptide and their positions in the sequence.

The results of the hydrolysis simulation using commonly used commercial enzymes are presented in Figure 3. Among the nine enzymes used, pepsin (pH > 2) (EC 3.4.23.1) theoretically released the most DPP-IV and ACE inhibitory peptides, followed by stem bromelain (EC 3.4.22.32) and papain (EC 3.4.22.2). The effectiveness of these enzymes in releasing bioactive peptides depends mainly on their cleavage specificity. Pepsin has broad specificity and preferentially cleaves peptides with aromatic or carboxylic L-amino acid linkages, F and L at the C-terminal location and, to a lesser extent, E linkages. However, it does not cleave at V, A, or G [35]. Stem bromelain exhibits a strong cleavage preference for Z-R-R-I-NHMec among small molecule substrates, while papain cleaves peptide bonds containing basic amino acids such as arginine and lysine, and residues following phenylalanine [36,37]. The bioactive peptides released by the different enzymes are listed in Table S2. Based on the in silico predictions, these three enzymes were chosen for the subsequent in vitro enzymatic hydrolysis.
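Counting how often a dipeptide and its reversed form occur in a protein sequence is straightforward to sketch. The example below assumes a plain one-letter amino acid string and counts overlapping occurrences; the sequence shown is a hypothetical toy string, not one of the oyster proteins, and the function name is illustrative.

```python
def count_motif(sequence: str, motif: str) -> int:
    """Count occurrences of a short motif in a sequence, allowing overlaps."""
    if not motif:
        raise ValueError("motif must not be empty")
    sequence = sequence.upper()
    motif = motif.upper()
    width = len(motif)
    return sum(
        1
        for start in range(len(sequence) - width + 1)
        if sequence[start:start + width] == motif
    )


toy_sequence = "MAEAEKLAEE"  # hypothetical sequence for illustration
for dipeptide in ("AE", "EA"):
    print(dipeptide, count_motif(toy_sequence, dipeptide))
```

In this toy string AE occurs three times and EA once, which shows that residue order, and not just composition, determines which motifs are present.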

**Figure 3.** Total number of bioactive peptides released in silico by commercial enzymes through BIOPEP-UWM's "Enzyme Action" tool (accessed on 6 March 2019). Other activities include antiamnestic, antibacterial, antioxidative, antithrombotic, neuropeptide, renin inhibitor, immunomodulating, stimulating, regulating, alpha glucosidase inhibitor, and activating ubiquitin-mediated proteolysis. DPP-IV: dipeptidyl peptidase IV.
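The idea behind an in silico hydrolysis simulation can be sketched in a few lines. The example below is a deliberate simplification and does not reproduce the rule set of the BIOPEP-UWM "Enzyme Action" tool: it cleaves only on the *C*-terminal side of F and L, and it matches the resulting fragments against a two-entry lookup built from the AE and EA examples in the text. The function names and the input sequence are hypothetical.

```python
def digest(sequence: str, cleave_after: str) -> list:
    """Cut a sequence on the C-terminal side of each listed residue."""
    sequence = sequence.upper()
    sites = set(cleave_after.upper())
    fragments = []
    start = 0
    for index, residue in enumerate(sequence):
        # Do not cut after the final residue; nothing would follow it.
        if residue in sites and index < len(sequence) - 1:
            fragments.append(sequence[start:index + 1])
            start = index + 1
    fragments.append(sequence[start:])
    return fragments


def released_bioactives(fragments: list, known: dict) -> dict:
    """Keep only fragments that exactly match a known bioactive peptide."""
    return {frag: known[frag] for frag in fragments if frag in known}


# Activities for AE and EA as described in the text; a real search would
# query the full database.
KNOWN_PEPTIDES = {"AE": "DPP-IV inhibitor", "EA": "ACE inhibitor"}

toy_sequence = "GLEAFAE"  # hypothetical sequence
fragments = digest(toy_sequence, cleave_after="FL")
print(fragments)
print(released_bioactives(fragments, KNOWN_PEPTIDES))
```

Here the toy sequence contains both EA and AE, but only AE is released as a free fragment; EA stays embedded in the fragment EAF. This is why the number of peptides an enzyme releases can be much smaller than the number of motifs encrypted in the parent protein.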

#### *2.3. In Vitro Hydrolysis of Oyster Proteins*

Oyster protein isolate (OPI) was used as raw material for in vitro enzymatic hydrolysis. Among the three enzymes, pepsin gave the highest degree of hydrolysis (DH) after 4 h, with a maximum value of 22.20 ± 0.97%, followed by papain and bromelain with 18.57 ± 0.61% and 17.86 ± 0.08%, respectively. The DH values of the three reactions increased rapidly from 0 to 0.5 h, followed by a slower, roughly linear increase as hydrolysis progressed (Figure S1). Generally, the rate of hydrolysis is fastest during the initial stages of the reaction, then slows, and levels off once the maximum DH is reached. In this study, DH was still increasing after 4 h, which indicates that the maximum DH for the three enzyme-catalyzed reactions had not yet been reached. One factor behind these slow reaction rates and low DH values is the low enzyme-to-substrate (E/S) ratio used in this study, since a higher enzyme concentration would provide more cleavage activity. Moreover, the rate and extent of hydrolysis can also be affected by the secondary and tertiary structures of proteins. Some protein tertiary structures are sensitive to environmental conditions such as acidic pH, which can make the proteins resistant to proteases and difficult to hydrolyze [38]. The hydrolysis conditions, yield, degree of hydrolysis, and peptide content of the hydrolysates produced by the different enzymes are summarized in Table 3.

**Table 3.** Hydrolysis conditions, yield, and peptide content of *C. angulata* protein hydrolysate. PEH: pepsin hydrolysate; BRH: bromelain hydrolysate; PAH: papain hydrolysate.


\* The yield was calculated as the dry weight of the lyophilized hydrolysate over the dry weight of the protein isolate used during hydrolysis. Mean values with different superscript letters are significantly different (*p* < 0.05).

PEH, which had the highest DH, also gave the highest yield (84.69%) among the three hydrolysate samples. However, the peptide content of PEH was very comparable to that of PAH, despite the difference in their DH values. This might be due to unequal solution volumes at 4 h of hydrolysis, when the sample aliquots were collected. The high temperature used during papain hydrolysis could also have led to evaporation, reducing the sample volume and increasing the concentration of peptides in the sample solution. Nevertheless, the results showed that an increase in DH can lead to the production of more small peptides and free amino acids.

Based on the protein/peptide patterns of *C. angulata* hydrolysates and fractions (Figure 4), all hydrolysates (PEH, BRH, and PAH) showed dispersion around and below 10 kDa, which was not observed in the OPI. Among the three hydrolysates, PEH had the highest concentration of low molecular weight peptides, which is attributed to its high DH. The degradation of actin (previously identified by the proteomics techniques) to different extents is also evident in all hydrolyzed samples.

However, the band with the highest MW remained visible even after hydrolysis, although it appeared lighter in PAH than in PEH and BRH. This protein may be sensitive to the high temperature applied during papain digestion. Moreover, the presence of light bands between 17 and 75 kDa indicates that some protein substrates were not cleaved even after 4 h of hydrolysis. This could be related to the low DH achieved with the three enzymes. Overall, the electrophoretic pattern of the *C. angulata* hydrolysates clearly supports the DH results, suggesting that pepsin breaks down oyster proteins into smaller peptides more effectively than bromelain and papain.

#### *2.4. Confirmation of Bioactivities Through In Vitro Tests*

#### 2.4.1. ACE Inhibitory Activity

Angiotensin I-converting enzyme (ACE) is an enzyme responsible for the regulation of blood pressure. It converts angiotensin I into the potent vasoconstrictor angiotensin II and degrades the vasodilator bradykinin, thus leading to an increase in blood pressure [39]. In this study, the potency of the three hydrolysates (PEH, BRH, and PAH) as inhibitors of ACE was evaluated. As shown in Figure 5A, all hydrolyzed samples exhibited inhibitory activity against ACE, which means that hydrolysis of the proteins with pepsin, bromelain and papain was able to generate ACE inhibitory peptides. PEH displayed higher ACE inhibitory activity than BRH and PAH at all concentrations. The inhibition rates of the hydrolysate samples were dose-dependent, except for BRH, for which a slight deviation was noticed at 1 mg/mL. The highest ACE inhibitory activity was noted in PEH prepared at 2 mg/mL, with a value of 78.18 ± 2.19%, followed by BRH and PAH with 52.97 ± 1.01% and 42.65 ± 4.73%, respectively. PEH, which showed higher DH and peptide content than the other two hydrolysate samples, also gave a stronger inhibitory effect against ACE. One reason for this is the high level of free amino acids and smaller peptides with ACE inhibitory properties liberated during hydrolysis. The biological activity of hydrolysates is influenced by the size, amount, and composition of free amino acids and peptides, and by the amino acid sequence [40]. This could also be associated with the cleavage specificity of pepsin, which targets the bulkiest hydrophobic residues. The liberation of hydrophobic residues during pepsin hydrolysis results in their exposure to the aqueous environment and their susceptibility to reaction with different biomolecules, which leads to subsequent biological activities [41]. Overall, the results predicted in silico coincided with the results obtained in vitro with regard to the effectiveness of pepsin in releasing peptides with ACE inhibitory activity.

**Figure 5.** In vitro angiotensin I-converting enzyme (ACE) inhibitory activity of *C. angulata* protein hydrolysates (**A**) and PEH fractions (**B**). Capital letters represent the significant difference (*p* < 0.05) among samples at specific concentrations; and small letters among concentrations within each sample. Each value (in percentage) represents the mean ± standard deviation (*n* = 3).
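The "mean ± standard deviation (*n* = 3)" values reported for the assays can be reproduced from triplicate readings as sketched below. The replicate values are hypothetical, and the use of the sample standard deviation (*n* − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def format_mean_sd(replicates, digits: int = 2) -> str:
    """Format replicate percentages as 'mean ± SD%'.

    Uses the sample standard deviation (n - 1 denominator), which is an
    assumption rather than something stated in the source.
    """
    if len(replicates) < 2:
        raise ValueError("at least two replicates are required")
    return f"{mean(replicates):.{digits}f} ± {stdev(replicates):.{digits}f}%"


# Hypothetical triplicate inhibition readings (percent).
print(format_mean_sd([76.0, 78.0, 80.0]))
```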

PEH was further separated into <1 kDa (F1), 1–5 kDa (F2), and >5 kDa (F3) fractions and their abilities to inhibit ACE activity were measured. The inhibition of ACE by PEH and its peptide fractions followed a dose-dependent pattern in which an increase in peptide concentration resulted in an increased inhibitory effect (Figure 5B). The ACE inhibitory activity assay revealed that F1 and F2 exhibited higher inhibitory activities (68.69 ± 0.82% and 65.95 ± 0.53%, respectively) than F3 (50.28 ± 0.09%) and PEH (60.32 ± 0.53%). Generally, peptides with very low MW are considered most suitable for the formulation of therapeutic agents, since these peptides can resist gastrointestinal digestion and can thereby be absorbed into the blood circulatory system in an intact form [42].

To compare the inhibitory efficiency of crude PEH and its fractions with respect to their MWs, the inhibitory efficiency ratio (IER) was calculated. Table 4 shows the peptide content, yield, and IER of PEH, F1, F2, and F3. The results show that F1 exhibited the greatest efficiency in inhibiting ACE activity, with an IER value of 217.05%/mg/mL, compared to PEH and the higher-MW fractions (F2 and F3), despite its low peptide content. This value is comparable to the IER of a hard clam peptide fraction with a MW of 1360–1180 Da [43]. The results indicate that products containing small peptides possess stronger ACE inhibitory activity. In addition, several studies have reported that peptides with strong ACE inhibition are generally short peptides [44]. In most cases, peptides which contain 3–20 amino acids have greater potency as bioactive peptides than their parent proteins [45].

As with the unfractionated hydrolysate samples, the ACE inhibition by the peptide fractions was dose-dependent: an increase in peptide concentration resulted in an increased inhibitory effect. Moreover, the inhibitory activity presented by PEH and its fractions is about half of the inhibitory activity of Captopril analyzed in this study (93.04%). Overall, the results suggest that *C. angulata* proteins could be an important source of peptides that are capable of ACE inhibition.


**Table 4.** Inhibitory activity, peptide content, yield, and inhibitory efficiency ratio of pepsin hydrolysate and peptide fractions.

\* Yield was calculated based on the dry weight of the lyophilized hydrolysate and fractions over the dry weight of the protein isolate and hydrolysate used during hydrolysis. <sup>a</sup> IER (inhibitory efficiency ratio) = % inhibition/peptide content. Different superscript letters represent significant difference between mean values (*n* = 3) at *p* < 0.05.
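The two quantities defined in the footnote above are simple ratios, sketched here for clarity. The formulas follow the footnote (yield as dry weight of product over dry weight of starting material, and IER as % inhibition divided by peptide content); the function names and all input values are hypothetical and are not taken from Table 4.

```python
def percent_yield(product_dry_weight_g: float, starting_dry_weight_g: float) -> float:
    """Yield (%) = dry weight of product / dry weight of starting material x 100."""
    if starting_dry_weight_g <= 0:
        raise ValueError("starting dry weight must be positive")
    return product_dry_weight_g / starting_dry_weight_g * 100.0


def inhibitory_efficiency_ratio(inhibition_percent: float,
                                peptide_content_mg_per_ml: float) -> float:
    """IER (%/mg/mL) = % inhibition / peptide content (mg/mL)."""
    if peptide_content_mg_per_ml <= 0:
        raise ValueError("peptide content must be positive")
    return inhibition_percent / peptide_content_mg_per_ml


# Hypothetical inputs chosen only to illustrate the arithmetic.
print(f"yield = {percent_yield(8.0, 10.0):.2f}%")
print(f"IER   = {inhibitory_efficiency_ratio(60.0, 0.25):.2f} %/mg/mL")
```

Because IER normalizes inhibition by peptide content, a fraction with a modest peptide content can still rank highest if its peptides are individually more potent, which is the pattern the text describes for F1.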

#### 2.4.2. DPP-IV Inhibitory Activity

Dipeptidyl peptidase-IV (DPP-IV) is a post-proline cleaving enzyme that causes the degradation of the incretins GLP-1 and GIP, leading to an increase in the blood glucose level. In this study, the ability of PEH, BRH, and PAH to inhibit DPP-IV was measured in vitro. Figure 6A shows that all the hydrolysate samples produced by the different enzymes were able to inhibit DPP-IV activity. The strongest inhibition was observed in PEH prepared at 2 mg/mL (44.37 ± 0.09%), followed by BRH (23.98 ± 0.07%) and PAH (23.44 ± 1.44%). These results are in agreement with the in silico predictions. The strong inhibitory activity of PEH can be related to the ability of pepsin to cleave peptides with aromatic amino acid linkages. Previous in silico studies have shown that DPP-IV inhibitory peptides usually have a branched-chain amino acid or an aromatic residue containing a polar group in the side chain (primarily W) at their *N*-terminal position and/or a P residue at position P1 [46]. In addition, the inhibitory activities of the hydrolysates were dose-dependent.

**Figure 6.** In vitro DPP-IV inhibitory activity of *C. angulata* protein hydrolysates (**A**) and PEH fractions (**B**). Capital letters represent the significant difference (*p* < 0.05) among samples at specific concentrations; and small letters among concentrations within each sample. Each value (in percentage) represents the average of three samples ± standard deviation (*n* = 3).

PEH, the hydrolysate with the highest inhibitory activity, was subjected to fractionation and the DPP-IV inhibitory activities of the fractions were also examined. The DPP-IV inhibitory activity of all fractions increased with increasing concentration. Among the samples, F1 presented the strongest inhibitory activity at most concentrations. At 1 mg/mL, however, F2 (55.08 ± 1.98%) displayed a higher inhibition value than F1 (48.42 ± 0.06%) (Figure 6B). The bioactivity of peptides depends not only on molecular weight but also on other factors such as amino acid composition and sequence [47]. The results revealed that the low MW fractions were more efficient inhibitors of DPP-IV than the high MW fractions with reference to their IER values (Table 4). Moreover, the inhibitory activity of Diprotin A (98.83%) obtained in this study was about 50% higher than that of the PEH fractions. Nevertheless, the above findings suggest that peptides from *C. angulata* proteins can be a good alternative source of bioactive peptides suitable for DPP-IV inhibition.

#### **3. Materials and Methods**

#### *3.1. Materials*

Portuguese oysters (*Crassostrea angulata*) were purchased from Penghu Island, Taiwan. They were packed in a box with ice and sent to the laboratory by freight transport. Pepsin (from porcine gastric mucosa), bromelain (from pineapple stem), and papain (from papaya) were obtained from Sigma-Aldrich (St. Louis, MO, USA). The angiotensin I converting enzyme (ACE) from rabbit lung (≥2 units/mg), *N*-(3-[2-furyl]-acryloyl)-phenylalanyl glycyl glycine (FAPGG), Dipeptidyl Peptidase IV (DPP-IV) from human recombinant (≥1 unit/mg), and Gly-Pro p-nitroanilide hydrochloride (≥99%) were also acquired from Sigma-Aldrich, USA. All chemical reagents used were of analytical grade.

#### *3.2. Oyster Meat Preparation*

Oysters (*C. angulata*) were manually shucked and the collected meat was washed with tap water and homogenized for 10 s using a food blender. The homogenized oyster meat was lyophilized for 48 to 72 h. It was then ground into a fine powder (100 mesh). The resulting oyster powder was stored at −20 °C until further analysis.

#### *3.3. Proteomics Techniques and In Silico Analysis*

#### 3.3.1. Sodium Dodecyl Sulfate Polyacrylamide Gel Electrophoresis (SDS-PAGE) Analysis

*C. angulata* proteins were separated through SDS-PAGE as described by Laemmli [48]. First, 1 mg of sample (dry weight, protein basis) was diluted in 1 mL sample buffer [0.5 M Tris–HCl (pH 6.8), glycerol, 10% (*w*/*v*) SDS, 0.5% (*w*/*v*) bromophenol blue, and β-mercaptoethanol]. The solution was then heated at 95 °C for 4 min and centrifuged at 4000× *g* for 15 min prior to loading. Electrophoresis was performed in a 12% running gel (ddH2O, 30% Acrylamide/Bis (37.5:1), 1.5 M Tris-HCl (pH 8.8), 10% (*w*/*v*) SDS, 10% (*w*/*v*) ammonium persulfate and TEMED) and a 4% stacking gel (ddH2O, 30% Acrylamide/Bis (37.5:1), 0.5 M Tris-HCl (pH 6.8), 10% (*w*/*v*) SDS, 10% (*w*/*v*) ammonium persulfate and TEMED). Ten microliters (10 μL) of sample and 5 μL of standard (AccuRuler RGB prestained protein ladder, MaestronGen Inc., Taiwan) were loaded into the wells of the gel. The power supply voltage was set at 70 V for the stacking gel and 110 V for the running gel. After electrophoresis, the gel was stained with Coomassie Brilliant Blue for 30 min and subsequently destained with water/methanol/acetic acid (7/2/1, *v*/*v*/*v*) solution for 15 min with continuous shaking. The gel was then scanned using a gel image scanner and the MW of the visible bands was determined using the VisionCapt software (V16.08a, Vilber Lourmat, Paris, France).

#### 3.3.2. Gel Slice and In-Gel Digestion

Distinct protein bands from the SDS-PAGE gel were sliced and subjected to in-gel digestion following the method of Shevchenko et al. [49] with some modifications. The sliced bands were cut into small cubes measuring around 1 mm<sup>3</sup>. The gel sample was then placed in a siliconized Eppendorf tube and spun down. Complete destaining was performed by subsequent addition of 50% and 25% acetonitrile/25 mM ammonium bicarbonate solution. Afterward, 100 μL of DTE solution (50 mM dithioerythritol/25 mM ammonium bicarbonate) was added to the gel sample, which was soaked for 1 h at 37 °C to break the disulfide bonds. After incubation, the mixture was spun down and the DTE solution was removed completely. The reduced gel sample was subjected to alkylation by incubating it with 100 μL of IAM solution (100 mM iodoacetamide/25 mM ammonium carbonate) for 1 h in the dark. The gel sample was then washed with 200 μL of 50% acetonitrile/25 mM ammonium bicarbonate for 15 min. The mixture was spun down and the buffer was pipetted out. The washing step was repeated four times to ensure that all buffers were removed completely. The gel sample was then soaked in 100 μL of 100% acetonitrile until it hardened and turned white. After removing all the acetonitrile, the remaining gel sample was dried for about 5 min in a SpeedVac concentrator. Digestion was performed by adding Lys-C/25 mM ammonium bicarbonate to the gel sample (1:50, enzyme:protein) followed by 3 h of incubation at 37 °C. The same amount of trypsin was then added and the mixture was incubated at the same temperature for at least 16 h to complete the digestion. Prior to extraction, the enzyme was deactivated by adding 50 μL of 50% acetonitrile/5% TFA to the mixture and sonicating it ten times (at 10 s intervals). The mixture was centrifuged and the supernatant containing peptides was aspirated and transferred into a new tube. The remaining gel sample was extracted again following the above steps, and the supernatants collected from the first and second extractions were combined and dried in a SpeedVac concentrator. The dried sample was then subjected to zip-tip purification prior to LC-ESI-MS/MS analysis.

#### 3.3.3. NanoLC–NanoESI–MS/MS Analysis

The MS/MS data of the peptide mixtures were acquired at the Institute of Biological Chemistry, Academia Sinica, Nangang District, Taipei, Taiwan. Liquid chromatographic separation was carried out on a C18 column using a segmented gradient from 5% to 35% solvent B (acetonitrile with 0.1% formic acid) over 60 min at a flow rate of 300 nL/min and a column temperature of 35 °C. Solvent A was 0.1% formic acid in water. The mass spectrometry analysis was performed in data-dependent mode, and full scan MS spectra were acquired in the orbitrap (*m*/*z* 350–1600) with the resolution set to 60,000 at *m*/*z* 400 and the automatic gain control (AGC) target at 10<sup>6</sup>. The 20 most intense ions were sequentially isolated for CID MS/MS fragmentation and detection in the linear ion trap (AGC target at 10,000), with previously selected ions dynamically excluded for 60 s. Singly charged ions and ions with unrecognized charge states were also excluded.

#### 3.3.4. Mascot Database Search

The collected MS/MS raw data were converted into MGF files and subjected to a Mascot database search to identify the proteins and/or peptides detected by MS. The data were searched against the National Center for Biotechnology Information (NCBI) database for metazoa (animals). The search parameters were carbamidomethyl (C) and oxidation (M) for variable modifications, ±10 ppm for peptide mass tolerance, 2+, 3+, and 4+ for peptide charge, ±0.6 Da for fragment mass tolerance, 2 for maximum missed cleavages, ESI-trap for the instrument used, and trypsin for the enzyme applied. All peptide masses were obtained as monoisotopic masses.
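The enzyme and tolerance settings listed above can be illustrated with a short sketch. It assumes the common trypsin rule (cleavage C-terminal to K or R, except before P) and a relative mass error expressed in ppm. The function names and the example sequence are hypothetical and are not part of the Mascot software.

```python
import re


def tryptic_peptides(sequence, max_missed=2):
    """In silico trypsin digest: cleave after K/R unless the next residue is P.

    Returns every peptide containing at most `max_missed` missed cleavages.
    """
    # Zero-width split points: preceded by K or R and not followed by P.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for start in range(len(fragments)):
        last = min(start + max_missed, len(fragments) - 1)
        for end in range(start, last + 1):
            peptides.append("".join(fragments[start:end + 1]))
    return peptides


def within_ppm(observed_mass, theoretical_mass, tol_ppm=10.0):
    """True if the relative mass error is within +/- tol_ppm."""
    error_ppm = (observed_mass - theoretical_mass) / theoretical_mass * 1e6
    return abs(error_ppm) <= tol_ppm


# Hypothetical sequence and masses, for illustration only.
print(tryptic_peptides("AKRPGKM", max_missed=1))
print(within_ppm(1000.005, 1000.0))
```

With two missed cleavages allowed, as in the search above, the candidate list grows quickly, which is one reason the precursor tolerance is kept tight.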

The Mascot ion score was −10 × log<sub>10</sub>(P), where P is the probability that the observed match is a random event. Protein scores were derived from ion scores as a non-probabilistic basis for ranking protein hits. Mascot search results presented the protein sequence coverage in percentage (%), indicating the sequence homology of identified tryptic peptides from oyster fractions to the corresponding protein hits.
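As a brief illustration of the score definition, the following sketch converts a match probability into an ion score and back. The function names are hypothetical; only the −10 × log10(P) relationship comes from the text.

```python
import math


def ion_score(p_random):
    """Ion score = -10 * log10(P), with P the probability of a random match."""
    if not 0.0 < p_random <= 1.0:
        raise ValueError("P must lie in (0, 1]")
    return -10.0 * math.log10(p_random)


def probability_from_score(score):
    """Inverse relationship: P = 10 ** (-score / 10)."""
    return 10.0 ** (-score / 10.0)


print(round(ion_score(0.05), 2))  # a 5% chance of a random match
print(probability_from_score(30.0))
```

On this scale, each additional 10 points corresponds to a tenfold lower probability that the match is random.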

#### 3.3.5. BLAST Analysis of Oyster Proteins

The protein sequence of myosin essential light chain protein from *C. angulata* was compared to its *C. gigas* counterpart using BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi), as described by Huang et al. [50], to determine the homology between the two proteins based on score, identities (%), positives (%), and gaps (%).

#### 3.3.6. BIOPEP-UWM Analysis of Bioactive Peptides and Enzyme Cleavages

The protein sequences obtained from the Mascot database search were analyzed using the BIOPEP database (http://www.uwm.edu.pl/biochemia/index.php/pl/biopep) to predict their bioactive peptide profile and enzyme cleavages. The activities, sequences, numbers, and locations of bioactive peptides in the protein sequences were determined using the "profiles of potential biological activity" tool. The identified protein sequences were then examined with the "enzyme action" tool, in which hydrolysis was simulated using different commercial enzymes available in the database. The theoretical peptides released by each enzyme were then submitted to the "search for active fragments" tool, and the peptides with the highest number of potential bioactivities were selected for further analysis.
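The idea behind locating known active fragments in a protein sequence can be sketched as a plain substring search. This is only an illustration of the principle, not the BIOPEP implementation. The function name, the example sequence, and the fragment list are hypothetical placeholders (IPI is the one-letter form of Ile-Pro-Ile, the reference DPP-IV inhibitor used in this study).

```python
def find_fragments(sequence, fragments):
    """Return sorted (1-based position, fragment) pairs for every occurrence.

    Overlapping occurrences are reported as well.
    """
    hits = []
    for fragment in fragments:
        start = sequence.find(fragment)
        while start != -1:
            hits.append((start + 1, fragment))
            start = sequence.find(fragment, start + 1)
    return sorted(hits)


# Hypothetical protein sequence and fragment list, for illustration only.
protein = "MIPIAVYIPI"
print(find_fragments(protein, ["IPI", "VY"]))
```

Counting hits per enzyme-released peptide would then give a simple ranking similar in spirit to selecting the peptides with the most potential bioactivities.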

#### *3.4. In Vitro Analyses*

#### 3.4.1. Protein Isolation

*C. angulata* proteins were isolated using the alkaline solubilization/isoelectric precipitation method described by Huang et al. [50]. In brief, the lyophilized oyster meat was mixed with 0.1 M NaOH at a ratio of 1:20 (powder:NaOH, *w*/*v*) and stirred for 2 h at room temperature. The mixture was centrifuged at 8000× *g* for 10 min at 4 °C. The supernatant was collected and adjusted to pH 5.5 with 0.1 N HCl to precipitate the myofibrillar proteins. The pH-adjusted mixture was centrifuged at 8000× *g* for 10 min at 4 °C, and the collected precipitate was lyophilized and stored at −20 °C until further analysis. The yield of the protein isolate was calculated as the dry weight of oyster protein isolate divided by the dry weight of oyster meat used in isolation, multiplied by 100.

#### 3.4.2. Enzymatic Hydrolysis

Enzymatic hydrolysis was performed following the combined methods of Dong et al. [19] and Jun et al. [51] with modifications. The oyster protein isolate (1 g, protein basis) was suspended in 100 mL deionized water. The homogenate was adjusted to 37 °C, pH 2 for pepsin; 50 °C, pH 7 for bromelain; and 65 °C, pH 7 for papain. The reaction started after adding the enzyme (1:100, *E*/*S*) and continued for 4 h. During hydrolysis, 1 mL aliquots were taken at different time points (0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, and 4 h) for determination of the degree of hydrolysis. The enzyme activity was terminated by raising the temperature to 95 °C for 10 min. The sample was then cooled and centrifuged at 8000× *g* for 10 min. The recovered supernatant was neutralized to pH 7, lyophilized, and then stored at −20 °C for further analysis. The resulting dried hydrolysates were labeled PEH (pepsin hydrolysate), BRH (bromelain hydrolysate), and PAH (papain hydrolysate).

#### 3.4.3. Degree of Hydrolysis

The degree of hydrolysis (DH) was calculated based on the amount of free amino groups using the modified o-phthalaldehyde (OPA) method of Charoenphun et al. [52]. The OPA reagent was prepared fresh by mixing 12.5 mL of 100 mM sodium tetraborate, 1.25 mL of 20% sodium dodecyl sulfate (SDS), 20 mg of OPA dissolved in 0.5 mL methanol, and 50 μL of β-mercaptoethanol. The final volume of the solution was adjusted to 25 mL by adding deionized water. In a 96-well microplate, 10 μL of hydrolysate sample/blank/standard was combined with 200 μL of OPA reagent and incubated at 37 °C for 100 s. The absorbance was read at 340 nm using a UV/Vis microplate spectrometer (SPECTROstar Nano, BMG LABTECH) and the amount of amino groups was quantified using Gly-Gly-Gly as standard. The total number of primary amino groups in the sample was determined by performing a complete acid hydrolysis using 6 N HCl for 24 h at 110 °C. The % DH was calculated as follows:

$$\text{DH} \left( \% \right) = \frac{(\text{NH}\_2)\_{\text{tx}} - (\text{NH}\_2)\_{\text{t0}}}{(\text{NH}\_2)\_{\text{total}} - (\text{NH}\_2)\_{\text{t0}}} \times 100\% \tag{1}$$

where (NH<sub>2</sub>)<sub>tx</sub> is the amount of free amino groups at X min, (NH<sub>2</sub>)<sub>total</sub> is the amount of total amino groups determined by total acid hydrolysis, and (NH<sub>2</sub>)<sub>t0</sub> is the amount of free amino groups at 0 min of hydrolysis.
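Equation (1) can be written as a small function. The sketch below is illustrative only: the function name is hypothetical and the amino group amounts are made-up numbers in arbitrary but consistent units.

```python
def degree_of_hydrolysis(nh2_tx, nh2_t0, nh2_total):
    """Degree of hydrolysis (%) following Equation (1).

    nh2_tx    : free amino groups at time X
    nh2_t0    : free amino groups at 0 min
    nh2_total : total amino groups after complete acid hydrolysis
    """
    denominator = nh2_total - nh2_t0
    if denominator <= 0:
        raise ValueError("total amino groups must exceed the value at 0 min")
    return (nh2_tx - nh2_t0) / denominator * 100.0


# Hypothetical amounts, for illustration only.
print(degree_of_hydrolysis(nh2_tx=3.0, nh2_t0=1.0, nh2_total=9.0))
```

Subtracting the 0 min value in both numerator and denominator means that amino groups already free before hydrolysis do not count toward the DH.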

#### 3.4.4. Fractionation

PEH was subjected to fractionation using Lefo Science-Spectrum Labs MAP-TFF Systems with molecular weight cut-offs of 5 and 1 kDa. The sample was dissolved in distilled water at 1% (*w*/*v*) concentration and passed through the ultrafiltration hollow fiber membrane. The recovered fractions, classified as F1 (<1 kDa), F2 (1–5 kDa), and F3 (>5 kDa), were then lyophilized and stored at −20 °C until further analysis. The yield of each fraction was calculated as the dry weight of the fraction divided by the dry weight of the hydrolysate used, multiplied by 100.

#### 3.4.5. Peptide Content Determination

The peptide contents of the hydrolysates and fractions were measured using the modified OPA method by Charoenphun et al. [52], previously used in measuring the degree of hydrolysis. The peptide content was quantified using Gly-Gly-Gly as standard.

#### 3.4.6. Angiotensin-I Converting Enzyme (ACE) Inhibitory Activity Assay

The ACE inhibitory activities of the protein hydrolysates and fractions were determined following the combined method of Raghavan and Kristinsson [53] and Udenigwe et al. [54]. In this method, *N*-[3-(2-furyl) acryloyl]-l-phenylalanyl glycyl glycine (FAPGG) was used as a synthetic substrate for ACE. Each assay sample was dissolved in 50 mM Tris-HCl buffer (pH 7.5) containing 0.3 M NaCl at a final assay concentration of 0.5, 1.0, and 2.0 mg protein/mL (for hydrolysates) and 0.25, 0.5, and 1 mg protein/mL (for fractions). In a 96-well microplate, 20 μL of sample was combined with 170 μL of 0.5 mM FAPGG solution and pre-incubated at 37 °C for 10 min. Buffer solution was used as control. Thereafter, 10 μL of 0.5 U/mL ACE (pre-heated at 37 °C for 10 min) was added to each well and the rate of decrease in absorbance at 345 nm was recorded for 30 min at 1 min intervals using a UV/Vis microplate spectrometer (SPECTROstar Nano, BMG LABTECH) preset to 37 °C. Captopril (1 mg/mL) was used as a reference inhibitor for the assay. The ACE inhibitory activity was calculated as:

$$\text{ACE inhibitory activity (\%)} = \left[ \frac{\Delta \text{A min}^{-1} \text{ (control)} - \Delta \text{A min}^{-1} \text{ (sample)}}{\Delta \text{A min}^{-1} \text{ (control)}} \right] \times 100\% \tag{2}$$

where ΔA min<sup>−1</sup> (sample) is the ACE activity in the presence of peptides, while ΔA min<sup>−1</sup> (control) is the ACE activity in the absence of peptides.
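A minimal sketch of Equation (2) follows, assuming that the rate ΔA min⁻¹ is estimated as the least-squares slope of absorbance against time over the 30 min reading window. That fitting choice, the function names, and the simulated readings are assumptions for illustration and are not stated in the method.

```python
def slope_per_min(times_min, absorbances):
    """Least-squares slope of absorbance versus time (absorbance units per min)."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbances))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    return sxy / sxx


def ace_inhibition(rate_control, rate_sample):
    """ACE inhibitory activity (%) following Equation (2)."""
    return (rate_control - rate_sample) / rate_control * 100.0


# Simulated readings every 1 min for 30 min (hypothetical values).
times = list(range(31))
control = [1.0 - 0.010 * t for t in times]  # uninhibited: faster decrease
sample = [1.0 - 0.004 * t for t in times]   # with peptides: slower decrease

inhibition = ace_inhibition(slope_per_min(times, control), slope_per_min(times, sample))
print(round(inhibition, 1))
```

Because both rates are negative (absorbance decreases as FAPGG is cleaved), their ratio is positive and the sign cancels in Equation (2).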

#### 3.4.7. Dipeptidyl Peptidase IV (DPP-IV) Inhibitory Activity Assay

The inhibitory activities of the protein hydrolysates and fractions against the enzyme dipeptidyl peptidase-IV (DPP-IV) were determined following the combined methods of Lacroix and Li-Chan [55] and Zhang et al. [56]. Hydrolysate samples and fractions were dissolved in 100 mM Tris-HCl buffer (pH 8) to obtain a final assay concentration of 0.5, 1, and 2 mg protein/mL and 0.25, 0.5, and 1 mg protein/mL, respectively. In a 96-well microplate, 25 μL of assay sample was mixed with 25 μL of 1.6 mM Gly-Pro-p-nitroanilide and pre-incubated at 37 °C for 10 min. Then, 50 μL of 0.008 U/mL DPP-IV (diluted with the same Tris-HCl buffer) was added to the mixture. Diprotin A (Ile-Pro-Ile) was used as the reference inhibitor. Each sample was analyzed in triplicate and Tris-HCl buffer was used as blank. The positive control (DPP-IV activity in the absence of inhibitor) and negative control (no DPP-IV activity) were prepared using the same buffer solution in place of the sample and the DPP-IV solution, respectively. The increase in absorbance per min was read in the UV/Vis microplate spectrometer (SPECTROstar Nano, BMG LABTECH) for 30 min at 37 °C, and the rate of DPP-IV inhibition was calculated using the following equation:

$$\text{DPP-IV inhibition activity (\%)} = \left[1 - (\text{A}\_{\text{s}} - \text{A}\_{\text{b}}) / (\text{A}\_{\text{pc}} - \text{A}\_{\text{nc}})\right] \times 100\tag{3}$$

where As, Ab, Apc, and Anc are the absorbance of the sample, blank, positive control, and negative control, respectively.
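Equation (3) translates directly into code. The sketch below uses a hypothetical function name and made-up absorbance values to show how the blank and the two controls enter the calculation.

```python
def dpp4_inhibition(a_sample, a_blank, a_positive, a_negative):
    """DPP-IV inhibition (%) following Equation (3).

    a_sample   : absorbance of the sample well
    a_blank    : absorbance of the blank
    a_positive : absorbance of the positive control (enzyme, no inhibitor)
    a_negative : absorbance of the negative control (no enzyme)
    """
    window = a_positive - a_negative
    if window <= 0:
        raise ValueError("positive control must exceed negative control")
    return (1.0 - (a_sample - a_blank) / window) * 100.0


# Hypothetical absorbance values, for illustration only.
print(round(dpp4_inhibition(0.30, 0.05, 0.55, 0.05), 1))
```

A sample that behaves like the positive control gives about 0% inhibition, and one that shows no signal above the blank gives 100%.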

#### 3.4.8. Calculation of Inhibition Efficiency Ratio

The inhibition efficiency ratio of PEH and fractions was calculated based on the inhibitory activity (%) over the peptide content (mg/mL) of each sample.
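As a worked example of this ratio, the sketch below divides an inhibition value by a peptide content. The 55.08% figure is the DPP-IV inhibition reported for F2 at 1 mg/mL; the peptide content of 0.8 mg/mL is a hypothetical placeholder, since the measured contents are given in Table 4 and not in this section. The function name is also hypothetical.

```python
def inhibition_efficiency_ratio(inhibition_percent, peptide_mg_per_ml):
    """IER: inhibitory activity (%) per unit peptide content (mg/mL)."""
    if peptide_mg_per_ml <= 0:
        raise ValueError("peptide content must be positive")
    return inhibition_percent / peptide_mg_per_ml


# 55.08% is from the text; 0.8 mg/mL is an assumed peptide content.
print(round(inhibition_efficiency_ratio(55.08, 0.8), 2))
```

Normalizing by peptide content lets fractions with different peptide concentrations be compared on the same basis.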

#### *3.5. Statistical Analysis*

Data are presented as mean ± SD (standard deviation) of three replicates for each sample. Data were analyzed using IBM SPSS (Statistical Package for the Social Sciences) version 20.0 (New York, NY, USA) by one-way analysis of variance (ANOVA) followed by Tukey's post-hoc test to estimate the significance among the main effects at the 5% probability level.
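The first step of this analysis can be sketched in plain Python: mean ± SD per group and the one-way ANOVA F statistic. The study used SPSS, so this is only an illustration of the calculation. The function name and replicate values are hypothetical, and the p-value and Tukey's post-hoc comparisons are omitted because they require distribution functions from a statistics package.

```python
from statistics import mean, stdev


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of replicate groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical triplicate measurements for three samples.
data = {"A": [1.0, 2.0, 3.0], "B": [2.0, 3.0, 4.0], "C": [5.0, 6.0, 7.0]}
for name, values in data.items():
    print(f"{name}: {mean(values):.2f} ± {stdev(values):.2f}")
print(one_way_anova_f(list(data.values())))
```

The F statistic would then be compared with the critical value at the 5% level for the given degrees of freedom before any pairwise comparison is made.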

#### **4. Conclusions**

The application of in silico techniques provided rapid and reliable information for the identification of bioactive peptides from *C. angulata* proteins and for the determination of a suitable enzyme for the generation of these peptides. The results showed correspondence between in silico prediction and in vitro confirmation. Based on the above findings, *C. angulata* protein hydrolysates can be a good source of peptides with ACE and DPP-IV inhibitory activities. Moreover, pepsin (pH > 2) showed the most promise in releasing bioactive peptides from *C. angulata* proteins both in silico and in vitro. Furthermore, fractionation enhanced the ability of the hydrolysate to inhibit ACE and DPP-IV activities. Overall, peptides from *C. angulata* proteins can be an alternative source of bioactive peptides capable of ACE and DPP-IV inhibition and can be used as a functional ingredient with pharmaceutical and nutraceutical applications. However, in vivo testing is strongly recommended to ensure the safety and stability of these peptides during gastrointestinal digestion.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/1422-0067/20/20/5191/s1.

**Author Contributions:** Conceptualization, H.L.R.G., J.P.P., and Y.-W.C.; Methodology, H.L.R.G. and L.A.T.; Validation, H.L.R.G.; Formal Analysis, H.L.R.G.; Investigation, H.L.R.G.; Resources, H.L.R.G. and Y.-W.C.; Data Curation, H.L.R.G., J.P.P. and Y.-W.C.; Writing—Original Draft Preparation, H.L.R.G.; Writing—Review and Editing, L.A.T.; Visualization, H.L.R.G.; Supervision, J.P.P. and Y.-W.C.; Funding acquisition, J.P.P. and Y.-W.C.

**Funding:** This research was funded by Ministry of Science and Technology, Taiwan (MOST: 106-2311-B-019-001) and Department of Science and Technology, Philippines.

**Acknowledgments:** We thank Academia Sinica Common Mass Spectrometry Facilities, Institute of Biological Chemistry, Academia Sinica, supported by Academia Sinica Core Facility and Innovative Instrument Project (AS-CFII-108-107) for the LTQ-Orbitrap data acquired.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## **Protective Effects of Novel Antioxidant Peptide Purified from Alcalase Hydrolysate of Velvet Antler Against Oxidative Stress in Chang Liver Cells In Vitro and in a Zebrafish Model In Vivo**

#### **Yuling Ding <sup>1,†</sup>, Seok-Chun Ko <sup>2,†</sup>, Sang-Ho Moon <sup>3</sup> and Seung-Hong Lee <sup>1,</sup>\***


Received: 23 September 2019; Accepted: 18 October 2019; Published: 19 October 2019

**Abstract:** Velvet antler has a long history in traditional medicine. It is also an important healthy ingredient in food as it is rich in protein. However, there has been no report about antioxidant peptides extracted from velvet antler by enzymatic hydrolysis. Thus, the objective of this study was to hydrolyze velvet antler using different commercial proteases (Alcalase, Neutrase, trypsin, pepsin, and α-chymotrypsin). Antioxidant activities of the different hydrolysates were investigated using a peroxyl radical scavenging assay by electron spin resonance spectrometry. Among all enzymatic hydrolysates, the Alcalase hydrolysate exhibited the highest peroxyl radical scavenging activity. The Alcalase hydrolysate was then purified using ultrafiltration, gel filtration, and reverse-phase high performance liquid chromatography. The purified peptide was identified as Trp-Asp-Val-Lys (tetrapeptide) with a molecular weight of 547.29 Da by Q-TOF ESI mass spectroscopy. This purified peptide exhibited strong scavenging activity against peroxyl radical (IC50 value, 0.028 mg/mL). In addition, this tetrapeptide showed significant protection against AAPH-induced oxidative stress by inhibiting reactive oxygen species (ROS) generation in Chang liver cells in vitro and in a zebrafish model in vivo. This research suggests that the tetrapeptide derived from the Alcalase hydrolysate of velvet antler is an excellent antioxidant and could be effectively applied as a functional food ingredient and pharmaceutical.

**Keywords:** velvet antler; alcalase hydrolysate; antioxidant peptide; protection ability; oxidative stress

#### **1. Introduction**

Reactive oxygen species (ROS) are chemically reactive species containing oxygen. ROS are normally produced in living organisms during the metabolism of oxygen. Under normal conditions in the body, ROS can be effectively eliminated by antioxidant defense systems such as endogenous antioxidant enzymes and non-enzymatic factors [1]. However, overproduction of ROS caused by various factors can lead to oxidative stress and a variety of pathological conditions, including metabolic impairments such as inflammation, aging, cancer, and cardiovascular diseases [2]. Therefore, a sufficient amount of antioxidants needs to be consumed to prevent or slow down oxidative stress induced by ROS. The amount of synthetic antioxidants used by humans is under strict regulation due to their potential health hazards [3]. Thus, natural antioxidants without side effects or toxicity have attracted great interest.

Food-derived peptides have been shown to be potent antioxidants without serious side effects [4]. As such, many researchers have sought to discover bioactive peptides from food proteins and to develop them as alternatives to synthetic antioxidants. In addition, peptides from gastrointestinally digested food proteins may act as potential physiological modulators of metabolism during gastrointestinal digestion [5,6]. Several recent studies have suggested that animal proteins are good sources of antioxidant peptides and demonstrated that animal protein hydrolysates and/or their antioxidant peptides may promote health by decreasing oxidative stress [7–9].

Velvet antler is a typical traditional medicine of animal origin that is recognized in the pharmacopeias of Korea, China, and Japan. It has been used as a traditional medicine for over 2000 years. It is also used as a functional food or nutraceutical supplement in New Zealand, Canada, and the USA [10]. Reports indicate that the main bioactive components of velvet antler are polypeptides and proteins [11]. The traditional extraction of bioactive components from velvet antler is generally done by simmering in hot water. However, there is some controversy surrounding this approach, due largely to the extremely limited recovery of bioactive components in water extractions. In recent years, enzymatic hydrolysis using commercial proteases has been successfully applied to the extraction of numerous biologically active peptides from a wide variety of food proteins and organisms. Recently, several studies have reported that enzymatic hydrolysates extracted from velvet antler using commercial proteases such as Alcalase, Protamex, pepsin, and Neutrase show a variety of biological benefits, including anti-obesity, anti-inflammatory, and antioxidant effects [12–14]. Anti-inflammatory peptides derived from velvet antler protein have also been reported [15].

However, to the best of our knowledge, there have been no reports about antioxidant peptides extracted from velvet antler by enzymatic hydrolysis. Therefore, the objective of the present study was to evaluate the antioxidant activities of hydrolysates from velvet antler prepared with five commercial proteases (Alcalase, Neutrase, trypsin, pepsin, and α-chymotrypsin) and to identify the amino acid sequences of purified peptides with the strongest antioxidant activity. Protective effects of the purified peptides against 2,2′-azobis(2-amidinopropane) dihydrochloride (AAPH)-induced oxidative stress in Chang liver cells and a zebrafish model were also investigated.

#### **2. Results**

#### *2.1. Preparation of Enzymatic Hydrolysates from Velvet Antler and Their Peroxyl Radical Scavenging Activities*

Velvet antler was successfully hydrolyzed with various commercial proteases (trypsin, pepsin, α-chymotrypsin, Neutrase, and Alcalase) to produce potent antioxidant peptides. Yields of the velvet antler enzymatic hydrolysates, measured by dry weight, were 34.09%, 12.39%, 38.96%, 23.81%, and 29.75% for trypsin, pepsin, α-chymotrypsin, Neutrase, and Alcalase, respectively (Table 1). Antioxidant activities of these hydrolysates against peroxyl radical were examined using an ESR spectrometer. Their scavenging activities are shown in Table 1. Among these hydrolysates, the Alcalase-derived hydrolysate possessed the highest peroxyl radical scavenging activity. Although the trypsin hydrolysate also showed potent peroxyl radical scavenging activity, Alcalase can produce shorter peptide sequences and terminal amino acid sequences responsible for various bioactivities, and it is useful for the production of bioactive peptides [16–18]. Therefore, the Alcalase hydrolysate was selected for identification of the antioxidant peptide in further studies.


**Table 1.** Extraction yield and peroxyl radical scavenging activities of enzymatic hydrolysates from velvet antler.

These values are expressed as mean ± S.E. from triplicate experiments. 1) Radical scavenging activity was measured at 1 mg/mL by ESR spectrometry.

#### *2.2. Purification and Identification of Antioxidant Peptide*

Initially, the Alcalase hydrolysate of velvet antler was fractionated using two ultrafiltration membranes (5 and 10 kDa MWCO). Three fractions with different molecular weights (>10 kDa, 5–10 kDa, and <5 kDa) were obtained. Peroxyl radical scavenging activities of these three fractions are shown in Table 2. The <5 kDa fraction possessed the highest peroxyl radical scavenging activity. The IC50 (half maximal inhibitory concentration) value of the <5 kDa fraction was 0.26 mg/mL, lower than that of the >10 kDa fraction (0.30 mg/mL) or the 5–10 kDa fraction (0.35 mg/mL). Accordingly, the <5 kDa fraction was further purified and separated using a Sephadex G-25 column. As shown in Figure 1A, four fractions were obtained. Their peroxyl radical scavenging activities were then determined. Fraction 3 (Fr. 3) exhibited the strongest peroxyl radical scavenging activity, with an IC50 value of 0.12 mg/mL. Thus, Fr. 3 was further separated by RP-HPLC and five main fractions were obtained (Figure 1B). Fraction 3-3 (Fr. 3-3) showed the strongest peroxyl radical scavenging activity with an IC50 value of 0.028 mg/mL (Figure 1B), suggesting that Fr. 3-3 could possess potent antioxidant activity by scavenging peroxyl radicals. Thus, the amino acid sequence of Fr. 3-3 was determined using a Q-TOF ESI mass spectrometer. The purified peptide was identified as the tetrapeptide Trp-Asp-Val-Lys (named TAVL). The molecular weight of this tetrapeptide was 547.29 Da (Figure 1C). As shown in Table 3, the IC50 value of TAVL for peroxyl radical scavenging activity was 51.16 μM, whereas that of the positive control, ascorbic acid, was 19.26 μM, indicating that ascorbic acid is the more potent scavenger. Nevertheless, these results confirm the strong antioxidant activity of TAVL.
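The two IC50 values reported for the purified peptide (0.028 mg/mL and 51.16 μM) are related through the reported molecular weight of 547.29 Da. The short sketch below shows the unit conversion; the function name is hypothetical.

```python
def mg_per_ml_to_micromolar(conc_mg_per_ml, mw_g_per_mol):
    """Convert a mass concentration (mg/mL, i.e. g/L) to micromolar."""
    return conc_mg_per_ml / mw_g_per_mol * 1e6


# Values taken from the text: IC50 = 0.028 mg/mL, molecular weight = 547.29 Da.
ic50_um = mg_per_ml_to_micromolar(0.028, 547.29)
print(round(ic50_um, 2))
```

Expressing both TAVL and ascorbic acid in molar units is what makes the comparison in Table 3 independent of their very different molecular weights.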

**Table 2.** Peroxyl radical scavenging activities of different molecular weight fractions from Alcalase hydrolysate of velvet antler.


These values are expressed as mean ± S.E. from triplicate experiments. <sup>a,b</sup> Values with different letters are significantly different at *p* < 0.05 as analyzed by Duncan's multiple range test.

**Figure 1.** Purification and identification of antioxidant peptide. (**A**) Sephadex G-25 gel filtration chromatogram of the <5 kDa fraction from Alcalase hydrolysate (upper panel) and its peroxyl radical scavenging activity (lower panel). (**B**) RP-HPLC chromatogram of the potent peroxyl radical scavenging fraction (Fr. 3) isolated from G-25 (upper panel) and its peroxyl radical scavenging activity (lower panel). (**C**) Identification of the amino acid sequence (left panel) and molecular weight (right panel) of the purified peptide (TAVL) from Alcalase hydrolysate of velvet antler with a Q-TOF ESI mass spectrometer. These values are expressed as mean ± S.E. from triplicate experiments. <sup>a–c</sup> Values with different letters are significantly different at *p* < 0.05 as analyzed by Duncan's multiple range test.

**Table 3.** Comparison of the peroxyl radical scavenging activity of the tetrapeptide (TAVL) and ascorbic acid.


These values are expressed as mean ± S.E. from triplicate experiments.

#### *2.3. Intracellular Antioxidant Activities of the Purified Peptide*

The cytotoxicity of the purified peptide (TAVL) was determined by 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assay at multiple TAVL concentrations (25, 50, 100, 200, 400, 800, and 1600 μg/mL) prior to evaluating its intracellular antioxidant activities. Results revealed that TAVL did not exhibit cytotoxicity at concentrations of up to 400 μg/mL, as compared with control survival (Figure 2A). Therefore, non-toxic concentrations of TAVL were used to examine its protective effect against AAPH-induced cell damage in Chang liver cells. As shown in Figure 2B, AAPH treatment without TAVL significantly decreased cell viability. However, TAVL protected cells against cellular damage induced by AAPH in a dose-dependent manner. The generation of intracellular ROS was measured by analyzing DCF fluorescence intensity levels. As Figure 2C shows, the fluorescence intensity of the control group (AAPH- and TAVL-untreated negative control) was 137, and the fluorescence intensity of cells treated with AAPH only was 3272. However, pretreatment of AAPH-exposed cells with TAVL at 25, 50, 100, 200, and 400 μg/mL reduced fluorescence intensities (intracellular ROS production levels) to 3202, 2899, 2806, 2719, and 2420, respectively. These results suggest that this antioxidant peptide (TAVL) could be developed into a potential bio-molecular candidate to inhibit cellular damage and intracellular ROS formation. Since this antioxidant peptide (TAVL) was found to exert antioxidant effects, its protective effect against AAPH-induced oxidative stress was further investigated using a zebrafish model in vivo.
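To put the fluorescence values above on a common scale, one option is to express each reading as the percentage reduction of the AAPH-induced increase over the untreated control. This normalization is an assumption made here for illustration and is not an analysis reported by the authors; the function and variable names are hypothetical, while the intensities are those given in the text.

```python
CONTROL = 137      # no AAPH, no TAVL
AAPH_ONLY = 3272   # AAPH without TAVL
# TAVL dose (ug/mL) -> DCF fluorescence intensity, as reported in the text.
TAVL_TREATED = {25: 3202, 50: 2899, 100: 2806, 200: 2719, 400: 2420}


def ros_reduction_percent(f_treated, f_aaph=AAPH_ONLY, f_control=CONTROL):
    """Reduction of the AAPH-induced fluorescence increase, in percent."""
    return (f_aaph - f_treated) / (f_aaph - f_control) * 100.0


for dose, intensity in TAVL_TREATED.items():
    print(f"{dose} ug/mL: {ros_reduction_percent(intensity):.1f}% reduction")
```

On this scale, 0% means no protection relative to AAPH alone and 100% means fluorescence back at the untreated control level.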

**Figure 2.** Protective effects of purified peptide (TAVL) against 2,2′-azobis-(2-amidinopropane) dihydrochloride (AAPH)-induced oxidative stress in Chang liver cells. Cells were treated with TAVL at the indicated concentrations. (**A**) Cytotoxic effect of TAVL on viability of normal cells. After 24 h of treatment with TAVL, cell viabilities were assessed by MTT assay. (**B**) Effect of TAVL on cell viability of AAPH-treated Chang liver cells. Cell viabilities were assessed by MTT assay. (**C**) Effect of TAVL on intracellular ROS generation in AAPH-treated Chang liver cells. Intracellular ROS generation was detected by 2′,7′-dichlorodihydrofluorescein diacetate (DCFH-DA) assay. Values are expressed as mean ± S.E. from triplicate experiments. Significant differences from the AAPH-only group (positive control) were identified at \* *p* < 0.05, \*\* *p* < 0.01 as analyzed by Duncan's multiple range test. The control group represents the negative control that received neither AAPH nor sample.

#### *2.4. Protective Effects of Antioxidant Peptide (TAVL) against AAPH-Induced Oxidative Stress in a Zebrafish Model In Vivo*

AAPH-induced oxidative stress could eventually lead to cell death and overproduction of ROS and lipid peroxidation. In the present study, the protective effects of TAVL against AAPH-induced cell death, ROS generation, and lipid peroxidation in the zebrafish were investigated. As shown in Figure 3A, cell death of zebrafish was significantly elevated by AAPH treatment compared to non-AAPH-treated zebrafish. However, cell death induced by AAPH in zebrafish was remarkably reduced by treatment with TAVL in a dose-dependent manner. Effects of TAVL on AAPH-induced ROS generation and lipid peroxidation level are shown in Figure 3B,C, respectively. The control, which contained no AAPH or TAVL, generated a clear image. After treatment with only AAPH, a fluorescence image was generated, suggesting that generation of ROS and lipid peroxidation had taken place in zebrafish embryos in the presence of AAPH. However, when zebrafish embryos were treated with TAVL prior to AAPH treatment, dose-dependent reductions in the generation of ROS and lipid peroxidation were observed. These results demonstrated that the antioxidant peptide (TAVL) could have a protective effect against oxidative stress through its antioxidant activity.

**Figure 3.** Protective effects of antioxidant peptide (TAVL) against AAPH-induced oxidative stress in a zebrafish model. (**A**) Protective effect of TAVL on AAPH-induced cell death in zebrafish embryos. Cell death levels were measured after staining with acridine orange followed by image analysis and fluorescence microscopy. (**B**) Inhibitory effect of TAVL on AAPH-induced ROS production in zebrafish embryos. ROS levels were measured after staining with 2′,7′-dichlorodihydrofluorescein diacetate (DCF-DA) followed by image analysis and fluorescence microscopy. (**C**) Inhibitory effect of TAVL on AAPH-induced lipid peroxidation in zebrafish. Lipid peroxidation levels were measured by DPPP staining. The fluorescence intensity of individual zebrafish was quantified using the ImageJ program. Values are expressed as mean ± S.E. Significant differences from the AAPH-only group (positive control) were identified at \* *p* < 0.05, \*\* *p* < 0.01 as analyzed by Duncan's multiple range test. The control group represents the negative control that received neither AAPH nor sample.

#### **3. Discussion**

Velvet antler is rich in proteins, which may account for 60% (*w*/*w*) of its dry matter [19]. However, velvet antler proteins have received only limited attention as a potential bioactive resource. Recently, proteases have been successfully applied to the extraction of bioactive compounds from velvet antler. Several studies have also reported that enzymatic hydrolysates prepared from velvet antler using proteases show a variety of biological benefits, including anti-obesity, anti-inflammatory, and antioxidant effects [12–15]. However, there have been no reports on antioxidant peptides that can be extracted from velvet antler by enzymatic hydrolysis. Therefore, the aim of this study was to purify and identify antioxidant peptides from velvet antler enzymatic hydrolysates and to evaluate their antioxidant properties using a peroxyl radical scavenging assay. Protective effects of the purified peptide against AAPH-induced oxidative stress in Chang liver cells and in a zebrafish model in vivo were also determined.

To obtain novel active antioxidant peptides, five commercial proteases were used under their optimal conditions to hydrolyze velvet antler. The Alcalase hydrolysate showed the highest peroxyl radical scavenging activity. In addition, several studies have suggested that Alcalase is useful for the production of bioactive peptides from food proteins [16–18,20,21]. Moreover, Alcalase can produce shorter peptide sequences and terminal amino acid sequences responsible for various bioactivities [16,18]. Thus, the Alcalase hydrolysate was selected for identification of the antioxidant peptide in further studies.

The molecular weight of a peptide is an important factor for its function [22]. Ultrafiltration (UF) is a simple and efficient technology for separating molecules according to their molecular weights [23]. In this study, the Alcalase-proteolytic hydrolysate of velvet antler was separated into three fractions with different molecular weights (MW < 5 kDa, MW of 5–10 kDa, and MW > 10 kDa) by a UF system. Among these three MW groups, the <5 kDa fraction showed the strongest peroxyl radical scavenging activity. Previous reports have found that food protein hydrolysates can be separated into three fractions (>10 kDa, 5–10 kDa and <5 kDa) by UF according to MW and that the <5 kDa fraction exhibits the strongest free radical scavenging activity [6,17,24]. The results of the present study also demonstrated that the low-MW fraction of the Alcalase-proteolytic hydrolysate had higher free radical scavenging activity than the higher-MW fractions. Therefore, the <5 kDa fraction was selected for purification and identification of the antioxidant peptide.

Sequential chromatography, consisting of Sephadex G-25 gel filtration chromatography followed by RP-HPLC, was used to purify the antioxidant peptide from the active <5 kDa fraction. After this two-step isolation, we obtained the purified active peptide. Its amino acid sequence was determined with a Q-TOF ESI mass spectrometer. The purified antioxidant peptide was identified as the tetrapeptide Trp-Asp-Val-Lys (TAVL). The properties of an antioxidant peptide depend on its molecular weight, amino acid sequence, and composition [25]. As reported previously, antioxidant peptides with a lower molecular weight can more easily pass the intestinal barrier and exert biological effects [25–27]. The purified tetrapeptide Trp-Asp-Val-Lys in the present study had a low molecular weight of 547.29 Da. It showed good antioxidant activities in the present experiments, in agreement with previous reports. In addition, the amino acid composition of the peptide sequence is another important factor for its antioxidant effects [4]. Hydrophobic amino acids, including Trp, Pro, Tyr, Lys, Leu, Val, and His, play an important role in the radical scavenging effects of peptides [26]. It has also been reported that antioxidant peptides containing aromatic amino acid residues (Trp and Tyr) have strong antioxidative capacities because they can stabilize active oxygen through direct electron transfer [1,4]. In this study, the identified tetrapeptide has three amino acid residues (Trp, Val, and Lys) responsible for its antioxidant activity. These represent 3/4 of its composition.
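The reported mass can be cross-checked from the sequence. The following is a minimal sketch, assuming standard monoisotopic residue masses (reference values, not figures from this study) and a singly protonated ion; the function name `peptide_masses` is illustrative only.

```python
# Monoisotopic residue masses (Da) for the four residues of Trp-Asp-Val-Lys.
# Assumption: standard reference values, not data taken from the article.
RESIDUE_MASS = {
    "Trp": 186.07931,
    "Asp": 115.02694,
    "Val": 99.06841,
    "Lys": 128.09496,
}
WATER = 18.01056   # one water accounts for the free N- and C-termini
PROTON = 1.00728   # mass added by protonation in positive-mode ESI


def peptide_masses(residues):
    """Return (neutral monoisotopic mass, [M+H]+ m/z) for a linear peptide."""
    neutral = sum(RESIDUE_MASS[r] for r in residues) + WATER
    return neutral, neutral + PROTON


neutral_mass, mh_plus = peptide_masses(["Trp", "Asp", "Val", "Lys"])
print(f"neutral: {neutral_mass:.2f} Da, [M+H]+: {mh_plus:.2f}")
```

Under these assumptions the neutral mass comes out near 546.28 Da and the protonated ion near 547.29, which suggests that the reported 547.29 Da corresponds to the singly charged (M + H) signal rather than to the neutral peptide.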

Overproduction of ROS can induce oxidative stress, which can cause numerous diseases and disorders. Cellular damage by ROS-induced oxidative stress also often impairs the function of biomolecules and leads to cell death [28]. AAPH is a free radical–generating compound widely used to mimic the oxidative stress state [24,29,30]. Hence, AAPH was used in this study to induce oxidative stress in order to assess the intracellular antioxidant activity of the purified antioxidant peptide. The level of ROS production in cells was detected with the oxidant-sensitive fluorescent probe DCFH-DA to determine whether the purified antioxidant peptide could prevent AAPH-induced ROS generation and the resulting oxidative stress. Our results showed that treatment of Chang liver cells with AAPH significantly increased the intracellular ROS level. However, the purified antioxidant peptide inhibited this AAPH-induced ROS generation. AAPH generates free radicals by reacting with oxygen, resulting in rapid formation of peroxyl radicals. The inhibitory action of the purified antioxidant peptide on ROS production demonstrated here can be attributed to its peroxyl radical scavenging activity. The purified antioxidant peptide was evaluated further with regard to its protective effects against AAPH-induced cellular damage. Exposure of cells to AAPH resulted in a significant decrease in cell viability. However, treatment with the purified antioxidant peptide inhibited cell death, suggesting that it could protect cells against AAPH-induced cytotoxicity. These results suggest that the purified antioxidant peptide could have a protective effect against ROS-induced oxidative stress, thus leading to reduced cellular injury.

Recent reports indicated that zebrafish can be used as a rapid and simple model to assess antioxidant activity against oxidative stress in vivo [29,30]. Therefore, in the present study, we investigated the antioxidant effect of the purified antioxidant peptide against AAPH-induced oxidative stress in vivo using the zebrafish model. Our results showed that treating zebrafish embryos with AAPH significantly increased cell death and ROS levels. However, the purified antioxidant peptide inhibited this AAPH-induced cell death and ROS generation. Lipid peroxidation may be a form of free radical–caused cellular damage [31]. In the present study, lipid peroxidation was significantly increased by AAPH treatment in zebrafish embryos. However, the purified antioxidant peptide effectively inhibited this lipid peroxidation. The protective effect of the purified antioxidant peptide against lipid peroxidation can be attributed to its antiperoxidative effect. Taken together, these results further support that the purified antioxidant peptide could be utilized as a natural antioxidant to potentially protect cells against oxidative stress.

In conclusion, an antioxidant tetrapeptide was purified and identified from Alcalase-proteolytic hydrolysate of velvet antler. The identified antioxidant peptide (Trp-Asp-Val-Lys, 547.29 Da) exhibited great antioxidant activity based on peroxyl radical scavenging assay. In addition, this tetrapeptide significantly inhibited AAPH-induced ROS production in Chang liver cells and in a zebrafish model in vivo. These results demonstrate that ROS reduction by tetrapeptide may contribute to attenuation of intracellular oxidative stress. This tetrapeptide could have potential applications in functional foods, nutraceutical, and pharmaceutical industries.

#### **4. Materials and Methods**

#### *4.1. Chemicals and Reagents*

2,2-Azobis-(2-amidinopropane) dihydrochloride (AAPH) and α-(4-pyridyl-1-oxide)-N-tert-butylnitrone (4-POBN) were purchased from Sigma Chemical Co. (St. Louis, MO, USA). Proteases including pepsin, trypsin, and α-chymotrypsin were purchased from Sigma-Aldrich (St. Louis, MO, USA). Neutrase and Alcalase were purchased from Novozyme Co. (Novo Nordisk, Bagsvaerd, Denmark). Penicillin-streptomycin and trypsin-EDTA were purchased from Gibco-BRL (Burlington, ON, Canada). 2′,7′-dichlorodihydrofluorescein diacetate (DCFH-DA) and 3-(4,5-Dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) were obtained from Sigma-Aldrich (St. Louis, MO, USA). All other chemicals and reagents used were of analytical grade and obtained from commercial sources.

#### *4.2. Sample Preparation*

Velvet antler was obtained from a farmed elk at Daesungsan Deer Farm (Daegwallyeong, Korea) at 75 days after casting. The fresh velvet antler was immediately sliced and lyophilized. The lyophilized velvet antler was ground into a fine powder and stored at −20 °C until use.

#### *4.3. Preparation of Enzymatic Hydrolysates from Velvet Antler*

Enzymatic hydrolysis was performed using five commercial proteases (trypsin, pepsin, α-chymotrypsin, Neutrase, and Alcalase) at their optimal conditions (pH and temperature) as described previously [24]. Briefly, one gram of dried velvet antler powder was added into 100 mL of distilled water. Each enzyme was then added to give a substrate-to-enzyme ratio of 100:1. Enzymatic hydrolysis was conducted under optimal conditions for 24 h, after which the hydrolysate was boiled at 100 °C for 10 min to inactivate the enzyme. These hydrolysates were clarified by centrifugation at 3000× *g* for 20 min to remove any unhydrolyzed residue. The supernatant of each hydrolysate was filtered, adjusted to pH 7.0, and stored for subsequent use in experiments.

#### *4.4. Peroxyl Radical Scavenging Activity*

Peroxyl radicals were generated by AAPH and their scavenging activities were measured using an electron spin resonance (ESR) spectrometer (JEOL, Tokyo, Japan) in accordance with the method described by Hiramoto et al. [32]. Briefly, 20 μL of 40 mM AAPH and 20 μL of 40 mM 4-POBN were mixed with 20 μL of PBS and 20 μL of the tested sample at the indicated concentration. The mixture was incubated at 37 °C in a water bath for 30 min and then transferred into a capillary tube. Experimental conditions were as follows: power, 10 mW; amplitude, 1 × 1000; modulation width, 0.2 mT; sweep width, 10 mT; sweep time, 30 s; and time constant, 0.03 s.
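The volumes given above fix the final reagent concentrations in the ESR mixture. The sketch below works through that dilution arithmetic; the helper `final_concentration` is a hypothetical name, and whether the "indicated" sample concentrations refer to the stock or to the final mixture is not stated in the text, so only the dilution factor is computed.

```python
def final_concentration(stock_mM, added_uL, total_uL):
    """Concentration after dilution: C_final = C_stock * V_added / V_total."""
    return stock_mM * added_uL / total_uL


# Volumes (in microlitres) of the four components described in the text.
volumes_uL = {"AAPH": 20, "4-POBN": 20, "PBS": 20, "sample": 20}
total_uL = sum(volumes_uL.values())

aaph_final_mM = final_concentration(40, volumes_uL["AAPH"], total_uL)
pobn_final_mM = final_concentration(40, volumes_uL["4-POBN"], total_uL)
sample_dilution = total_uL / volumes_uL["sample"]  # fold dilution of the sample

print(total_uL, aaph_final_mM, pobn_final_mM, sample_dilution)
```

With four equal 20 μL additions, each 40 mM stock ends up at 10 mM in an 80 μL mixture, and the sample is diluted four-fold.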

#### *4.5. Isolation of Antioxidant Peptides from the Enzymatic Hydrolysate of Velvet Antler*

#### 4.5.1. Fractionation According to the Molecular Weight

The enzymatic hydrolysate that possessed the highest peroxyl radical scavenging activity was fractionated using Millipore's Lab scale TFF system (Millipore Corporation, Bedford, MA, USA) equipped with ultrafiltration membranes (MWCO: 5 and 10 kDa) at 4 °C. Three fractions (>10 kDa, 5–10 kDa, and <5 kDa) were obtained.

#### 4.5.2. Purification of Antioxidant Peptides

The target fraction (500 mg) was loaded onto a Sephadex G-25 column (2.5 × 100 cm) pre-equilibrated with filtered distilled water. Elution was then carried out with filtered distilled water at a flow rate of 1.5 mL/min. Absorbance of each fraction at 220 nm was read and the sub-fractions were collected. The fraction with the highest peroxyl radical scavenging activity obtained was then subjected to reverse-phase high performance liquid chromatography (RP-HPLC) on an Atlantis T3 column (3 μm, 3.0 × 150 mm, Waters, NY, USA) with a linear gradient of acetonitrile (0–100% *v*/*v*, 30 min) at a flow rate of 1.0 mL/min. Elution peaks were detected at 220 nm.

#### 4.5.3. Identification of Purified Antioxidant Peptide

Molecular weight and amino acid sequences of antioxidant peptides purified from velvet antler were determined using a MicroQ-TOFIII mass spectrometer (Bruker Daltonics, Hamburg, Germany) coupled with an electrospray ionization (ESI) source. The purified peptide was dissolved in distilled water and infused into the ESI source. Its molecular weight was determined by singly charged (M + H) state analysis in the mass spectrum.

#### *4.6. Experiments for Antioxidant Activity Assay Using Cells*

#### 4.6.1. Cell Culture

The human hepatocyte–derived cell line Chang liver was obtained from the American Type Culture Collection (ATCC, Manassas, VA, USA). It is a well-known cell line used in experiments on various biological activities, including cellular antioxidant activity. Chang liver cells were cultured in Dulbecco's modified Eagle's medium (DMEM, Gibco-BRL, Burlington, ON, Canada) supplemented with 10% (*v*/*v*) heat-inactivated fetal bovine serum (FBS, Gibco-BRL, Burlington, ON, Canada) and 1% (*v*/*v*) antibiotic. Cultures were maintained at 37 °C in a 5% CO<sub>2</sub> incubator.

#### 4.6.2. Measuring Cytoprotective Effect by MTT Assay

The cytoprotective effect of the purified peptide was determined by a colorimetric MTT assay using Chang liver cells. Briefly, cells were seeded into 96-well culture plates at a cell density of 1 × 10<sup>5</sup> cells/mL. After incubation for 16 h, cells were treated with various concentrations (25, 50, 100, 200, 400, 800, and 1600 μg/mL) of purified peptide. One hour later, 15 mM of AAPH was added to each well. Cells were then incubated for an additional 24 h at 37 °C. After incubation, 50 μL of MTT solution (stock concentration: 5 mg/mL in DPBS) was added into each well, and cells were incubated at 37 °C for 4 h. Supernatants were aspirated and formazan crystals in each well were dissolved in DMSO. Absorbance at 540 nm was then measured.
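The text does not state how absorbance readings were converted to cell viability. A common convention, assumed here rather than taken from the authors' protocol, expresses each well as a percentage of the untreated control. The absorbance values below are hypothetical and serve only to show the calculation.

```python
def percent_viability(absorbance_sample, absorbance_control, blank=0.0):
    """Viability relative to untreated control, in percent.

    Assumed convention: (A_sample - blank) / (A_control - blank) * 100.
    """
    return (absorbance_sample - blank) / (absorbance_control - blank) * 100.0


# Hypothetical A540 readings, for illustration only (not data from the study).
control = 0.80            # no AAPH, no peptide
aaph_only = 0.40          # AAPH without peptide
aaph_plus_peptide = 0.60  # peptide pre-treatment followed by AAPH

viability_aaph = percent_viability(aaph_only, control)
viability_protected = percent_viability(aaph_plus_peptide, control)
print(f"{viability_aaph:.1f}% vs {viability_protected:.1f}%")
```

A protective effect would appear as a higher percentage in the peptide-treated wells than in the AAPH-only wells.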

#### 4.6.3. Intracellular ROS Measurement

To detect levels of intracellular ROS, the DCFH-DA method was used as described previously [33]. Briefly, Chang liver cells were seeded into 96-well culture plates at a cell density of 1 × 10<sup>5</sup> cells/mL. After 16 h, cells were treated with various concentrations of purified peptide and then incubated at 37 °C. One hour later, 15 mM of AAPH was added to the culture. Cells were then incubated for an additional 30 min at 37 °C. DCFH-DA solution (5 μg/mL) was then introduced to the cells. DCFH-DA fluorescence was detected at an excitation wavelength of 485 nm and an emission wavelength of 535 nm using a Perkin-Elmer LS-5B spectrofluorometer.

#### *4.7. In Vivo Zebrafish Model for Antioxidant Activity Assay*

Adult zebrafish were maintained as described in our previous studies [30,34]. At 7–9 h post-fertilization (hpf), zebrafish embryos were collected and arrayed in a 12-well plate (15 embryos/well) containing 2 mL of embryo medium. The embryos were incubated with or without purified peptide for 1 h and then exposed to AAPH (15 mM) up to 24 hpf. Thereafter, zebrafish embryos were transferred into fresh embryo medium and allowed to develop up to 72 hpf. Cell death, intracellular ROS, and lipid peroxidation in zebrafish were estimated according to previously reported methods [33,34]. Briefly, at 72 hpf, zebrafish embryos were transferred into 24-well plates and separately stained with specific fluorescent probe dyes to determine cell death (acridine orange), intracellular ROS (2′,7′-dichlorodihydrofluorescein diacetate (DCFH-DA)), and lipid peroxidation (diphenyl-1-pyrenylphosphine (DPPP)). Following incubation for a specified period in the dye-containing media, embryos were rinsed with fresh embryo medium, anesthetized, and then observed under a fluorescence microscope equipped with a CoolSNAP-Pro color digital camera (Olympus, Tokyo, Japan). The fluorescence intensities of individual zebrafish were quantified using Image J 1.46r software (Wayne Rasband, National Institutes of Health, Bethesda, MD, USA). Cell death, intracellular ROS, and lipid peroxidation were calculated by comparing the fluorescence intensities of treated embryos to those of controls.

#### *4.8. Statistical Analysis*

Data are presented as means ± standard error (SE). Statistical comparisons of mean values were performed by analysis of variance (ANOVA) followed by a Duncan's multiple range test using SPSS software.
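The summary statistics described above can be sketched with the Python standard library. This is an illustration under stated assumptions, not the SPSS procedure used in the study: it computes mean ± SE and the one-way ANOVA *F* statistic only, Duncan's post hoc test is not implemented, and the triplicate readings are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_se(values):
    """Return (mean, standard error), with SE = sample SD / sqrt(n)."""
    return mean(values), stdev(values) / sqrt(len(values))


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical triplicate readings (arbitrary units) for three treatment groups.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
m, se = mean_and_se(groups[0])
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"group 1: {m:.2f} ± {se:.2f}; F({df_b}, {df_w}) = {f_stat:.2f}")
```

The *F* value would then be compared against the *F* distribution with the returned degrees of freedom; a post hoc procedure such as Duncan's test is only applied when the overall ANOVA is significant.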

**Author Contributions:** Conceptualization, S.-H.M. and S.-H.L.; Formal analysis, Y.D., S.-C.K. and S.-H.L.; Investigation, Y.D. and S.-C.K.; Writing – original draft, Y.D. and S.-H.L.

**Acknowledgments:** This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07046262) and was supported by the Soonchunhyang University Research Fund.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Hydrolysed Collagen from Sheepskins as a Source of Functional Peptides with Antioxidant Activity**

#### **Arely León-López <sup>1</sup>, Lucía Fuentes-Jiménez <sup>2</sup>, Alma Delia Hernández-Fuentes <sup>1</sup>, Rafael G. Campos-Montiel <sup>1</sup> and Gabriel Aguirre-Álvarez <sup>1,</sup>\***


Received: 5 July 2019; Accepted: 29 July 2019; Published: 13 August 2019

**Abstract:** The extraction and enzymatic hydrolysis of collagen from sheepskins at different times of hydrolysis (0, 10, 15, 20, 30 min, 1, 2, 3 and 4 h) were investigated in terms of amino acid content (hydroxyproline), isoelectric point, molecular weight (Mw) by the sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE) method, viscosity, Fourier-transform infrared (FTIR) spectroscopy, antioxidant capacity by 2,2′-azino-bis(3-ethylbenzothiazoline-6-sulphonic acid) (ABTS) and 2,2-diphenyl-1-picrylhydrazyl (DPPH) assays, thermal properties (Differential Scanning Calorimetry) and morphology by the scanning electron microscopy (SEM) technique. The kinetics of hydrolysis showed an increase in the protein and hydroxyproline concentration as the hydrolysis time increased to 4 h. FTIR spectra allowed us to identify the functional groups of hydrolysed collagen (HC) in the amide I region for collagen. The isoelectric point shifted to lower values compared to the native collagen precursor. The change in molecular weight and viscosity from time 0 min to 4 h promoted important antioxidant activity in the resulting HC. The lower the Mw, the greater the ability to donate an electron or hydrogen to stabilize radicals. From the SEM images it was evident that HC after 2 h had a porous and spongy structure. These results suggest that HC from sheepskins could be a good alternative to HC from typical sources like pigs, cows and fish.

**Keywords:** collagen; hydrolysis; enzyme; molecular weight; sheepskin

#### **1. Introduction**

Collagen is the most abundant protein in the bones and connective tissue of vertebrates, and there are at least 29 types. They differ in their amino acid sequence and composition, their function in the organism, and their structure [1,2]. The structure of collagen is a triple helix formed by three α chains with the repeating sequence Gly-X-Y, where X is proline and Y is mainly hydroxyproline. The triple helix is stabilized by hydrogen bonds, with continuous repetition of the Gly-X-Y unit depending on the collagen type [3]. Collagen can be extracted from the skins, bones, tendons and cartilages of pigs [4], cows [5], marine organisms [6–8] and rabbits [9]. Hydrolysed collagen (HC) refers to a group of peptides that results from the proteolysis of native type I collagen; its molecular weight (Mw) varies from 0.3 to 8 kDa [10]. It does not gel in solution at room temperature and is soluble in cold water, so it can mix easily with other products [11–13]. Additionally, hydrolysed collagen has a neutral smell, is colourless, and can be used in emulsions as a stabilizer. It is widely used in the pharmaceutical industry for the treatment of diseases like osteoarthritis and osteoporosis. It is also applied in the cosmetics and food industries, for example in the preparation of fruity beverages and nutritional supplements [14–17].

Antioxidant activity is the capacity of a substance to inhibit oxidative degradation by reacting with free radicals. Natural antioxidants such as HC exert antioxidant activity through hydrogen transfer or electron donation [18]. The antioxidant activity of hydrolysed collagen is generally associated with its molecular weight. Peptides with 2 to 10 amino acid residues, which have molecular weights well below 10 kDa, show high radical scavenging because of their accessibility to active radicals. The amino acid content as well as the Mw of HC are properties closely related to the antioxidant activity [19,20].

Several studies on antioxidant activity have been conducted with HC from different sources such as pigs [4,21], cows [22], fish [23–25], and invertebrates like jellyfishes or sponges [26,27]. However, less is known about the properties of hydrolysed collagen extracted from ovine sources and its possible applications. The objective of this research is the extraction and hydrolysis of collagen from sheepskins to establish the antioxidant activity as well as the physicochemical properties of the obtained peptides as a function of the hydrolysis time. These results could be of interest in developing an alternative source to fish, cows and pigs.

#### **2. Results and Discussion**

#### *2.1. Protein Content*

Figure 1 shows that the protein concentration of HC was affected by the hydrolysis time. Before hydrolysis (at 0 min), the protein concentration was 1.61 mg/mL. Within the first 20 min it fell to its lowest value, approximately 1.1 mg/mL. The concentration then remained constant at around 1.21 mg/mL, with no statistical differences (*p* > 0.05) from the 30 min treatment to the end of the experiment (4 h). This behaviour could be due to the decrease in available substrate and/or enzyme autodigestion [28]. All the treatments in this experiment gave higher protein concentrations than those reported by Paul and co-workers [29], who obtained hydrolysed collagen from cowhide using an enzymatic treatment and reported a protein concentration of 0.11 mg/mL. Other works [30] also obtained a low protein concentration (around 0.76 mg/mL) from chicken connective tissue with enzymatic treatment at pH 7.5. These results suggest that hydrolysis of collagen from sheepskin under the conditions described above was efficient compared to chicken and bovine sources.

**Figure 1.** Protein concentration for different hydrolysis times in ovine collagen. Values are the average of three replicates. Different letters indicate significant difference at *p* ≤ 0.05.

#### *2.2. Hydroxyproline Content*

Collagen differs from other proteins in its high concentration of hydroxyproline. This amino acid provides thermal stability to collagen molecules because of hydrogen bond formation and the presence of a hydroxyl group (OH), which limits the rotation of the peptide chain [15,31]. The hydrolysis of collagen reported in Figure 2 indicates that the longer the hydrolysis time, the more hydroxyproline is obtained. After 4 h of enzymatic and thermal treatment, the HC contained 24.47 mg/L. This trend agrees well with previous works carried out on fish bone gelatine [32] and pig collagen [33], which found a significant increment in hydroxyproline content as the hydrolysis time increased. Their maximum yield of this amino acid was also found at 4 h of thermal treatment. This increment in hydroxyproline content could be attributed to the hydrolysis of the polypeptide chain: the thermal and enzymatic treatments increased the detectable amount of hydroxyproline. This could be the case with some marine sources like Atlantic salmon skin [34] and bigeye snapper skin [35], for which values of 88.24 mg/mL and 87.75–90.86 mg/mL, respectively, were reported. However, Gómez-Lizárraga and co-workers [36] obtained 1.18 mg/L from bovine tendons.

**Figure 2.** Hydroxyproline concentration in hydrolysed collagen. Different letters indicate significant difference at *p* ≤ 0.05.

#### *2.3. Amino Acid Content in Hydrolysed Collagen as a Function of Hydrolysis Time*

The amino acid composition of hydrolysed collagen (HC) from sheepskin was very similar to that of other vertebrate collagen sources such as pigs [33], chicken [37], calves [38], cows [39] and fish [40]. Seventeen amino acids were identified and quantified as structural components of ovine collagen. These amino acids were monitored during the enzymatic hydrolysis of collagen. Results after 1 h were not included in Table 1 because no significant differences (*p* > 0.05) were observed. After 1 h of hydrolysis, serine was detected as the major component (19.33 mg/g of protein). The enzymatic treatment of collagen showed significant increments in aspartic acid and glutamic acid due to enzymatic cleavage of the polypeptide chains of collagen fibres [41]. These amino acids increased their concentrations considerably as a result of the deamidation of asparagine and glutamine, respectively [42]. Some amino acids such as lysine, proline, cysteine, tyrosine, valine, methionine, isoleucine, leucine and phenylalanine were sensitive to hydrolysis and their concentrations decreased considerably. The same behaviour was observed with HC from fish skin [43].


**Table 1.** Amino acid content (mg of amino acid/g of protein) of hydrolysed collagen as a function of hydrolysis time.

Results are mean values of three replicates ± SD. Values followed by different letters are significantly different according to Tukey's test (*p* ≤ 0.05).

#### *2.4. Isoelectric Point*

The isoelectric point (pI) is the pH at which the collagen molecule carries no net charge. As shown in Figure 3, the pI shifted from 4.61 to 3.68 at the end of the hydrolysis (4 h). Native collagen (0 min) had a pI value of about 4.7. Similar values of pI for acid-soluble (4.9) and pepsin-soluble (5.7) collagen were reported in the literature on the extraction of collagen from ovine bones [1]. Hydrolysed collagen (HC) is an amphoteric macromolecule composed of both acidic (COOH) and basic (NH3) functional groups, and the decrease in pI could be due to the deamidation process [10]. When HC was treated at high temperature, the asparagine groups were transformed into aspartic acid and the glutamine groups into glutamic acid [42]. This leads to a loss of amino groups and a large relative increase in the carboxyl groups, or a higher content of acidic amino acids, which become dominant, shifting the pI to lower values [44,45]. The pI of collagen therefore varies with the hydrolysis time. The higher the pI, the higher the viscosity observed, due to stronger electrostatic repulsions between collagen chains [46].

#### *2.5. Molecular Weight and Viscosity*

Hydrolysis of collagen is characterized by a reduction in its molecular weight (Mw). Changes in collagen Mw were monitored by SDS-PAGE. Figure 4 shows that native collagen (0 min) had the highest Mw, 260.33 kDa. Once hydrolysis started, the Mw dropped. The first significant changes (*p* ≤ 0.05) were observed at 15 min and 20 min, with 160.67 kDa and 138.89 kDa, respectively. However, there was a sharp decrease in Mw after treatment for 2 h (15.20 kDa). After this time, no significant changes in Mw were registered (*p* ≥ 0.05) up to 4 h, when the Mw was 5.62 kDa. These results are in good agreement with those reported in the literature, with values between 3 and 6 kDa [11,13]. Chi and co-workers [47] used trypsin digestion to obtain fish hydrolysed collagen with Mw around 14 kDa. Hydrolysed collagen from turkey byproducts was also obtained with Mw of 34 kDa by using different enzymes [48]. Previous works carried out on Alaska Pollack skin [43] and sea cucumber [49] found high antioxidative properties in HC with Mw around 6–8 kDa and 5 kDa, respectively. There is a close relationship between the Mw and the viscosity of the collagen [12]. At 0 min, the viscosity was 6800 cP. This high viscosity could be attributed to the presence of high-molecular-weight chains [50]. The viscosity decreased upon heating, in accordance with the hydrolysis time. It fell to 0.5 cP after 1 h of hydrolysis. After this time, no significant changes (*p* ≥ 0.05) were observed. The triple helix structure of native collagen was changed to a random coil form due to the dissociation of the hydrogen bonds [51].

**Figure 3.** Isoelectric point of ovine collagen hydrolysates as a function of hydrolysis time. Different letters indicate significant difference at *p* ≤ 0.05.

**Figure 4.** Correlation between molecular weight and viscosity during the hydrolysis of ovine collagen. Different letters indicate significant difference at *p* ≤ 0.05.

#### *2.6. Fourier Transform-Infrared Spectroscopy (FTIR)*

The FTIR spectra of both native collagen (0 min) and hydrolysed collagen (HC) were recorded in the range of 600–4000 cm<sup>−1</sup>. All FTIR spectra of HC samples overlapped each other. However, for reasons of clarity, Figure 5 only shows the range of 1000–3500 cm<sup>−1</sup> for the samples at 0 min and 4 h. There were no changes in peak location for the amide bands between the control and the treated sample. However, the amplitude of the bands in HC decreased significantly. Amide I at wavenumber 1641 cm<sup>−1</sup> was interpreted as the stretching vibrations of the carbonyl groups (C=O) along the polypeptide backbone. This band is characteristic of α-helix chains and is widely used to analyse the secondary structure of collagen [52]. Amide II was detected at 1548 cm<sup>−1</sup> for the stretching vibrations of the CN group. Amide III was mainly associated with intermolecular interactions at 1248 cm<sup>−1</sup>, representing the stretching vibrations of the C-N group and the deformation of the NH group from amide bonds [1]. The amide B (2946 cm<sup>−1</sup>) and amide A (3295 cm<sup>−1</sup>) bands were related to the asymmetric stretching of the CH<sub>2</sub> groups and the stretching vibrations of the NH group, respectively. The longer the hydrolysis time, the higher the vibrations of OH groups (1037 cm<sup>−1</sup>) reported in the spectra. These results agree very well with the literature [7], suggesting that the HC (1 h) maintained the same characteristics as native collagen (0 min), since the peak locations of all amide bands scarcely changed.

**Figure 5.** Fourier transform infrared spectra of collagen hydrolysates at 0 min and 1 h of hydrolysis time.

#### *2.7. Antioxidant Activity*

2,2′-azino-bis(3-ethylbenzothiazoline-6-sulphonic acid) (ABTS) and 2,2-diphenyl-1-picrylhydrazyl (DPPH) assays are frequently used to assess the antioxidant capacity of compounds as radical scavengers. The ABTS radical can be applied over a wide range of pH and is soluble in aqueous and organic media. It allows for the evaluation of both hydrophilic and lipophilic antioxidants [53,54]. The DPPH radical is one of the most stable free radicals. The DPPH assay is a simple and quick method that can be used to test the ability of compounds to act as free radical scavengers or hydrogen donors [55]. The antioxidant activity of hydrolysed collagen (HC) was evaluated by the ABTS and DPPH methods as shown in Figure 6. There were differences (*p* < 0.05) between the ABTS and DPPH radical scavenging activities. The highest ABTS and DPPH radical scavenging activity was found at 4 h of hydrolysis, with 67.6% and 52.75%, respectively. The ABTS technique gave higher values than DPPH. ABTS radical scavenging is commonly used to evaluate the ability of antioxidants to donate an electron or hydrogen atom to stabilize radicals [56]. The antioxidant activity of protein hydrolysates seemed to be affected by the amino acid composition as well as the degree of hydrolysis [33]. The longer the hydrolysis time, the higher the antioxidative activity observed. It is well known that several amino acids like tyrosine, histidine [57] and lysine possess antioxidant properties [43]. Some hydrophobic amino acids like isoleucine and methionine could also donate electrons or hydrogen, converting the radical to a more stable species and contributing to higher radical scavenging [58]. The amino acid content results of this research showed that strong hydrolysis (4 h) of collagen from sheepskin increased the concentration of these amino acids. At this time (4 h), the radical scavenging activity increased significantly because there was an increment of glutamic acid from 7.99 to 16.58 mg/g of protein. On the other hand, the enzymatic treatment of native collagen decreased its Mw to around 6 kDa. Previous works carried out with different sources such as fish [6,8] and squid [59] showed that Mw was one of the most important parameters determining the biological activity of collagen [60]. The lower the Mw of the polypeptides, the higher the antioxidant activity was found to be. These results suggest that hydrolysis of collagen generated a wide variety of smaller peptides and free amino acids depending on the hydrolysis time [61]. Therefore, the amino acid composition, the degree of hydrolysis, the size of the collagen chains and the source of raw material could define the antioxidant capacity of the HC.


**Figure 6.** Antioxidant capacity of ovine collagen during hydrolysis. The activity was evaluated with the ABTS and DPPH methods. Values are expressed as the mean ± SD (*n* = 3). Different letters indicate significant difference at *p* ≤ 0.05.

#### *2.8. Thermal Properties*

The thermal properties of hydrolysed collagen were evaluated in dried powder with 0% water content (d.b.) from samples obtained after different hydrolysis times. From Table 2, it can be seen that the melting temperature (Tm) decreased gradually from 153.3 °C (0 min) to 136.9 °C as the hydrolysis time increased up to 4 h. This reduction in Tm suggested that the thermal stability of the triple helical structure of collagen was affected after 1 h of thermal and enzymatic treatment (137.2 °C). It is well known that intermolecular helix formation is dependent on the molecular weight (Mw) of alpha chains [62]. The higher the Mw, the higher the Tm values that will be observed. The Tm results suggested that intermolecular helix formation decreased significantly as hydrolysis took place. This reduction originated at the bimolecular nucleation stage, which involved an intramolecular β-turn facilitated by glycine and/or proline residues [63]. This means that propagation was far less effective compared with native collagen at 0 min of treatment. The DSC thermograms of this native collagen showed a narrow melting range, suggesting a more homogeneous population of longer helical segments. However, as the hydrolysis process started, the samples showed a broad melting range, indicating a wide molecular weight distribution of helix lengths and shifting the Tm towards lower values.

**Table 2.** Thermal properties of dried hydrolysed collagen powder at different times of hydrolysis. Average value of three replicates. Values followed by different letters are significantly different according to Tukey's test (*p* ≤ 0.05).


Considering the enthalpy as the energy required to disorganize the helical structure, it was possible to assume that native collagen possessed a more ordered structure (20.06 J/g). However, hydrolysis treatment produced low molecular weight residues with a low possibility of intramolecular refolding. In fact, previous work [63] has suggested a limit of 40–80 amino acid residues as the critical size of nuclei for renaturation. Also, we have seen that samples with Mw < 15 kDa cannot recover their helical conformation even at high concentrations [64]. Enthalpy of samples with 4 h of hydrolysis showed significant differences (*p* < 0.05) with the lowest degree of reorganization (8.93 J/g). However, it still showed some energy requirements to disorganize its structure. This structural conformation could be due to the intermolecular interaction of two or three strands with low Mw [64].

#### *2.9. SEM Images*

The morphological appearance of native collagen and its resulting hydrolysates is shown in Figure 7. It can be seen that HC showed changes in morphology across the different hydrolysis times. During the first 20 min of hydrolysis, the collagen did not appear to have pores in its structure (Figure 7a–c). However, after 30 min of treatment (Figure 7d–f), initial degradation of collagen was seen in the form of small pores in the protein structure, the result of enzymatic action leading to a partial disassembly of fibres into fibrils, and therefore, the generation of low molecular weight polypeptides.

**Figure 7.** HC morphology changes during the hydrolysis alkaline treatment: (**a**) 0 min, (**b**) 10 min, (**c**) 15 min, (**d**) 20 min, (**e**) 30 min, (**f**) 1 h, (**g**) 2 h, (**h**) 3 h, (**i**) 4 h.

After 2 h of enzymatic hydrolysis (Figure 7h,i), the disaggregation of collagen fibrils within the collagen fibres was evidence of the autolysis of the enzymatic treatment [7]. The collagen structure of these treatments (after 3 h and 4 h) appeared to be extremely degraded, with a spongy and porous form. Previous works carried out with marine sources [65] discovered that the Mw of peptides influences the properties of HC. The lower the Mw, the more pores there are, and the more open the structure observed in the images.

#### *2.10. Relationship between Mw, Viscosity, Antioxidant Activity and Thermal Properties*

The properties of HC appeared to depend mainly on the Mw of the polypeptide chains obtained after hydrolysis, and the other parameters evaluated were strongly related to it. Viscosity was closely related to Mw: the results suggested that the low Mw of polypeptide chains produced a low hydrodynamic volume of collagen molecules in solution [42]. This could be why the viscosity dropped to close to 0 cP after 1 h of hydrolysis. The antioxidant activity also appeared to be dependent on the Mw parameter. The results from this research showed that the lower the Mw, the higher the antioxidant activity of HC. This behaviour was confirmed by the higher concentrations of hydroxyproline after 1 h of hydrolysis. Additionally, higher concentrations of aspartic acid and glutamic acid were obtained over the same period of time. The measurement of thermal properties in HC indicated that low-Mw samples (2 h, 3 h and 4 h) resulted in the lowest enthalpy. This means that less energy was required to disorganize the structure of HC because its low Mw prevented its organization into a triple helical form [66].

#### **3. Materials and Methods**

Sheepskins with 40–50% water content were used in this experiment. They were obtained as byproducts from a local market in Tulancingo, Hidalgo, Mexico. Reagent-grade acetic acid (99% purity), reagent-grade sodium chloride, porcine digestive protease (pepsin), dialysis membrane tubing with a 6–8 kDa molecular weight cut-off, 5% 4-dimethylaminobenzaldehyde, 65% perchloric acid, 0.006 M chloramine T and 0.8 M citrate buffer were purchased from Sigma-Aldrich Corp. (MA, USA).

#### *3.1. Conditioning of the Skin before Collagen Extraction*

The sheepskins were soaked to recover up to 70% water, followed by a fleshing process to remove the connective tissue and fat. Then, the skins were shaved to remove most of the hair.

#### *3.2. Collagen Extraction from Ovine Skin*

The methodology of Chuaychan et al. [67] was used, with some modifications. Pre-treated sheepskin was cut into small squares of approximately 1 cm<sup>2</sup> and suspended in 0.5 M acetic acid solution at a ratio of 1:10 (*w*/*v*). The sample was placed in a shaker (Bellco Biotechnology, NJ, USA) at 140 rpm for 3 h at room temperature (20 ± 2 °C), followed by the addition of pepsin at a concentration of 1 g/L with gentle stirring for 48 h. The sample was filtered and precipitated with a solution of 2.6 M sodium chloride. The precipitated material was centrifuged for 15 min at a relative centrifugal force of 3380× *g* in a centrifuge model Z36HK (HERMLE Labortechnik GmbH, Wehingen, Germany).

#### *3.3. Hydrolysis of Collagen*

Precipitated collagen was re-suspended in a solution of 1 M NaCO3 at a ratio of 1:4 (*w*/*v*). The pH was adjusted to 8 ± 0.2. The hydrolysis of collagen was carried out with the enzyme trypsin at a concentration of 1:50 (*w*/*v*) in a water bath at 60 °C for different times, as follows: 10 min, 15 min, 20 min, 30 min, 1 h, 2 h, 3 h and 4 h. The control sample was called 0 min. All the samples were inactivated at 90 °C for 10 min and stored at 4 °C.

#### *3.4. Protein Determination*

Following Bradford determination [68], 5 mL of Bradford reagent and 100 μL of the sample were added and mixed in a vortex for 2 min. After 5 min storage in darkness, the sample was read in a spectrophotometer at 595 nm. The serum albumin at different concentrations was used to create a calibration curve.
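The calibration-curve step can be sketched as an ordinary least-squares line through the standards, which is then inverted to read the protein concentration of a sample. This is a minimal sketch and not the authors' procedure: the function names and the standard concentrations and absorbances below are hypothetical, illustrative values, and a linear response over the calibration range is assumed.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def concentration_from_absorbance(a595, slope, intercept):
    """Invert the calibration line to estimate protein concentration."""
    return (a595 - intercept) / slope


# Hypothetical serum albumin standards (mg/mL) and absorbances at 595 nm.
standards_mg_ml = [0.0, 0.25, 0.5, 0.75, 1.0]
absorbance_595 = [0.05, 0.20, 0.35, 0.50, 0.65]

slope, intercept = fit_line(standards_mg_ml, absorbance_595)

# Hypothetical sample reading, converted with the fitted line.
sample_mg_ml = concentration_from_absorbance(0.41, slope, intercept)
```

Readings that fall outside the range of the standards should not be converted with the fitted line, because the assumed linearity is only checked inside that range.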

#### *3.5. Isoelectric Point*

The isoelectric point was measured using a Zetasizer nano ZS90 coupled to an MPT-2 auto-titrator. Laser Doppler and a DTS1070 cell (Worcestershire, UK) were used to determine the electrophoretic mobility. HC was diluted in distilled water at 1:10. Different pH values from 2 to 7 were obtained by using 0.5 M HCl and 0.75 M NaOH solutions.

#### *3.6. Hydroxyproline Quantification*

According to the AOAC methodology [69], 4 g of the sample and 30 mL of 3.5 M sulphuric acid were placed in an oven at 105 °C for 12 h. The volume was adjusted to 500 mL with distilled water and filtered. Two millilitres of filtered sample were mixed with 1 mL of oxidant solution (0.006 M chloramine T in 0.8 M citrate buffer, pH 6.0) in a reaction tube. The volume was adjusted to 100 mL and stirred for 30 min at room temperature. Two millilitres of colour reagent (10 g of 4-dimethylaminobenzaldehyde in 35 mL of 65% perchloric acid) were added with stirring at 60 °C for 15 min. The absorbance of samples was measured against the blank at 558 nm in a UV-visible Jenway Genova spectrophotometer (Bibby Scientific, Staffordshire, UK).

The collagen content was calculated with the following equation:

$$\%\,\text{Hydroxyproline} = \frac{(Y)(2.5)}{(W)(V)} \tag{1}$$

where:

*Y* = hydroxyproline concentration from the standard curve

*W* = sample weight

*V* = volume (mL) taken for adjustment to 100 mL
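Equation (1) is a single ratio, so it can be checked with a few lines of code. The sketch below is only illustrative: the function name and the input values are hypothetical, and the units of each argument are assumed to be those implied by the method and the standard curve.

```python
def hydroxyproline_percent(y, w, v):
    """Equation (1): % hydroxyproline = (Y * 2.5) / (W * V).

    y: hydroxyproline concentration read from the standard curve
    w: sample weight
    v: volume (mL) taken for adjustment to 100 mL
    """
    if w <= 0 or v <= 0:
        raise ValueError("sample weight and volume must be positive")
    return (y * 2.5) / (w * v)


# Hypothetical inputs: Y = 6.4, W = 4.0 (the 4 g sample of the method), V = 2.0.
example_percent = hydroxyproline_percent(y=6.4, w=4.0, v=2.0)
```

With these illustrative inputs the numerator is 16 and the denominator is 8, so the function returns 2.0.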

#### *3.7. Viscosity Analysis*

Viscosity measurement was carried out using a Brookfield RTV viscometer (MA, USA) (spindle number: 5; speed of 100 rpm). Viscosity was expressed in centipoise (cP). All the samples were previously conditioned at 7 °C [70].

#### *3.8. Molecular Weight*

The sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE) determination was performed according to the Laemmli methodology [71]. One millilitre of dialyzed collagen was dissolved in 0.5 M Tris–HCl buffer pH 6.8 (1% SDS, 10% glycerol and 0.01% bromophenol blue) and boiled for 5 min. Then, 10 μL of the denatured sample and 5 μL of a marker with a molecular weight from 10 kDa to 220 kDa (BenchMark Protein Ladder, Thermo Scientific, Pierce™, MA, USA) were loaded into wells at the top of the polyacrylamide gel. This gel contained a 4% stacking gel on top of the 12.5% resolving gel. A voltage of 50 V was applied for 30 min, and once the proteins reached the resolving layer, the voltage was increased to 100 V for 4 h in order to separate the proteins according to size. As the electric current ran through the buffer, the negatively charged proteins migrated towards the anode and lower molecular weight proteins reached the bottom of the gel. After the migration of proteins, the gel was stained with a Silver Stain Kit (Thermo Scientific, Pierce™, MA, USA).

#### *3.9. Fourier Transform-Infrared (FTIR) Spectroscopy*

The FTIR technique offers a green alternative because it allows us to quantify substances without organic solvents. The samples do not require any pretreatment, thus reducing the environmental damage caused by toxic waste. Also, it is a fast technique based on the natural vibrational frequencies of the chemical bonds present in molecules. FTIR is a non-destructive technique using a minimum amount of sample [72]. The absorption spectra by the FTIR technique were obtained with the Frontier FT-MIR (Perkin Elmer, MA, USA) equipment. The wavenumber ranged from 380 to 4000 cm<sup>−1</sup> at room temperature. The samples were brought into intimate contact with the diamond crystal by applying a loading pressure. For each sample, the spectrum represented an average of four scans with 4 cm<sup>−1</sup> resolution. A spectrum of the empty cell was used as the background. Automatic signals were collected in 3620 scans at 1 cm<sup>−1</sup> resolution. All the data were processed with Spectrum™ 10 (Perkin Elmer, MA, USA) software.

#### *3.10. Differential Scanning Calorimetry (DSC)*

Thermal properties of HC were obtained with a DSC series Q 2000 with intracooler RCS90 (DE, USA). It was calibrated with indium (Tm, onset = 156.6 °C, ΔH = 28.45 J/g). An average of 1.5 ± 0.1 mg of sample with known water content (0% d.b.) was packed and hermetically sealed in a 50-mL stainless steel pan. An empty, hermetically sealed pan was used as a reference. Both heating and cooling scans were performed at 10 °C/min. Two heating scans were performed from 25 °C to 120 °C. Melting point temperature (Tm) and enthalpy (ΔH) were determined with TA 2000 analysis software (TA Instruments, DE, USA) based on the endothermic changes registered in the thermogram.

#### *3.11. Amino Acid Determination*

Amino acid content determination was performed according to Cohen [73] with some modifications. Three milligrams of freeze-dried HC were suspended in 6 M HCl and 1% *v*/*v* phenol at 150 °C for 1 h. Hydrolysed samples were dissolved in 2 mL of 0.5 M citrate buffer. Amino acid content was determined by high-performance liquid chromatography (HPLC) in a Hewlett Packard model GmbH (Winchester, UK) connected to a fluorescence detector (Ex. 250 nm, Em. 395 nm). The derivatization reaction was carried out with 20 μL of the sample diluted in 60 μL buffer (borate buffer, Waters, Thermo Scientific, Pierce™, MA, USA) and 1 min stirring. Twenty microlitres of reagent AQC (Waters) were added with stirring for another 1 min, followed by the heating of the sample at 50 °C for 10 min. The amino acid separation was carried out in a Bluespher® column (100 × 2 mm ID) in reverse phase C18 octadecyl dimethylsilane (Berlin, Germany). Working conditions: mobile phase A: 50 mM sodium acetate, pH 5.75; mobile phase B: 50 mM sodium acetate, pH 6/ACN 30:70 *v*/*v*; flow rate: 1 mL/min.

#### *3.12. Antioxidant Activity*

A solution of 2,2′-azino-bis(3-ethylbenzothiazoline-6-sulphonic acid) (ABTS) radical was prepared according to the literature [74] by mixing 7 mM ABTS and 2.45 mM potassium persulfate. After 16 h of stirring at room temperature in the dark, the ABTS solution was diluted with ethanol to an absorbance of 0.70 ± 0.02 at 734 nm. One millilitre of the diluted ABTS solution was mixed with 0.2 mL of the sample and the absorbance was read at 734 nm in a UV-visible Jenway Genova spectrophotometer (Bibby Scientific, Staffordshire, UK).

For assessing the antioxidant activity by 2,2-diphenyl-1-picrylhydrazyl (DPPH) radical inhibition [75], 0.5 mL of the sample was mixed with 2.5 mL of 6.1 × 10<sup>−5</sup> M methanolic DPPH radical solution and maintained in darkness for 30 min. The absorbance was measured at 515 nm in a UV-visible Jenway Genova spectrophotometer. The antioxidant activity for ABTS and DPPH radical inhibition was calculated via the following equation:

$$\% \text{ Inhibition} = \frac{\text{Initial absorbance} - \text{Final absorbance}}{\text{Initial absorbance}} \times 100\tag{2}$$
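Equation (2) applies to both assays, with the absorbance read at 734 nm for ABTS and at 515 nm for DPPH. A minimal sketch follows; the function name is hypothetical, the initial absorbance of 0.70 is the adjusted ABTS value stated in the method, and the final absorbance is an invented illustrative reading.

```python
def percent_inhibition(initial_abs, final_abs):
    """Equation (2): radical inhibition as a percentage of the initial absorbance."""
    if initial_abs <= 0:
        raise ValueError("initial absorbance must be positive")
    return (initial_abs - final_abs) / initial_abs * 100


# ABTS solution adjusted to 0.70 at 734 nm; 0.35 is a hypothetical reading
# after mixing with the sample.
abts_inhibition = percent_inhibition(initial_abs=0.70, final_abs=0.35)
```

Halving the absorbance corresponds to 50% inhibition, which is what this illustrative pair of readings gives.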

#### *3.13. Scanning Electron Microscopy (SEM)*

Morphology was observed with a scanning electron microscope (Model S-2600N, HITACHI, Tokyo, Japan). Freeze-dried HC samples were mounted on a strip of self-adhesive carbon paper and sputter-coated with gold to be observed in the scanning electron microscope at an acceleration voltage of 15 kV.

#### *3.14. Statistical Analysis*

A randomized design experiment and an analysis of variance (ANOVA) were applied to the experimental data, which included a Tukey test (*p* ≤ 0.05). Data were analysed with SPSS 16.0 software (SPSS Inc., Chicago, IL, USA). Three replicates per treatment were considered in this experiment.
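The analysis itself was run in SPSS. As an illustration of what the one-way ANOVA computes, the sketch below derives the F statistic from between-group and within-group sums of squares using only the Python standard library. The function name and the triplicate readings are hypothetical and are not data from this study. The post hoc Tukey comparison is not reproduced, since it needs the studentized range distribution, which the standard library does not provide.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical triplicate readings for three treatments (three replicates each).
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)
```

For these invented readings the between-group sum of squares is 26 and the within-group sum of squares is 6, giving F = 13 on 2 and 6 degrees of freedom.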

#### **4. Conclusions**

The study demonstrated that sheepskins are a good source of hydrolysed collagen. The best results were seen after 2 h of hydrolysis treatment. From this point, the kinetics of native collagen hydrolysis produced polypeptides with a low molecular weight and viscosity. This reduction in the polypeptide chain size affected the thermal properties of HC as the 4 h treatment produced a lower enthalpy value. Also, the antioxidant properties of HC were enhanced as the hydrolysis time increased. The functional properties of HC could be controlled by the hydrolysis time and sheepskins appeared to be a good alternative to typical sources like pigs, cows and fish.

**Author Contributions:** Conceptualization, R.G.C.-M and G.A.-Á.; Data curation, A.D.H.-F and R.G.C.-M.; Formal analysis, A.D.H.-F, R.G.C.-M and G.A-A; Investigation, A.L.-L and L.F.-J; Methodology, A.L.-L; Supervision, G.A.-A; Writing—original draft, A.L.-L; Writing—review & editing, G.A.-A.

**Funding:** This research was funded by CONACyT, grant number 621400.

**Acknowledgments:** The first author gratefully acknowledges Dimitrios Zeugolis for his technical support during a research stay at the University of Galway, Ireland.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**


#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Development of a Soy Protein Hydrolysate with an Antihypertensive Effect**

**Eric Banan-Mwine Daliri <sup>1</sup>, Fred Kwame Ofosu <sup>1</sup>, Ramachandran Chelliah <sup>1</sup>, Mi Houn Park <sup>2</sup>, Jong-Hak Kim <sup>2</sup> and Deog-Hwan Oh <sup>1,</sup>\***


Received: 15 February 2019; Accepted: 21 March 2019; Published: 25 March 2019

**Abstract:** In this study, we combined enzymatic hydrolysis and lactic acid fermentation to generate an antihypertensive product. Soybean protein isolates were first hydrolyzed by Prozyme and subsequently fermented with *Lactobacillus rhamnosus* EBD1. After fermentation, the in vitro angiotensin-converting enzyme (ACE) inhibitory activity of the product (P-SPI) increased from 60.8 ± 2.0% to 88.24 ± 3.2%, while captopril (a positive control) had an inhibitory activity of 94.20 ± 5.4%. Mass spectrometry revealed the presence of three potent and abundant ACE inhibitory peptides, PPNNNPASPSFSSSS, GPKALPII, and IIRCTGC in P-SPI. Hydrolyzing P-SPI with gastrointestinal proteases did not significantly affect its ACE inhibitory ability. Also, oral administration of P-SPI (200 mg/kg body weight) to spontaneous hypertensive rats (SHRs) for 6 weeks significantly lowered systolic blood pressure (−19 ± 4 mm Hg, *p* < 0.05) and controlled body weight gain relative to control SHRs that were fed with physiological saline. Overall, P-SPI could be used as an antihypertensive functional food.

**Keywords:** antihypertensive peptides; functional food; food-derived; fermentation

#### **1. Introduction**

High blood pressure (hypertension) is a chronic degenerative disease and the leading risk factor for chronic kidney disease and cardiovascular diseases [1]. Uncontrolled hypertension can result in chronic damage to the vascular system, myocardial strokes, and even death [2]. For this reason, several pharmacological and nonpharmacological strategies aimed at reducing the incidence of this disease have been implemented. Over the years, several functional foods have been developed from different food proteins to be used as nonpharmacological treatments of high blood pressure [3]. Moreover, many food-derived bioactive peptides have demonstrated antihypertensive effects. Such peptides commonly inhibit angiotensin-converting enzyme (ACE) activity and/or reduce renin activity [4,5]. Soybean proteins are among the most common plant substrates used for food-derived antihypertensive peptide development [6,7]. Soy proteins constitute about 35–40% of the total dry weight of the bean and the major storage proteins are glycinin (11S globulin) and β-conglycinin (7S globulin). These storage proteins account for about 65–85% of the total soy proteins [8]. Recent studies have shown that consumption of fermented soybean meal can reduce the risk of cardiovascular disease mortality, cardiovascular disease, stroke, and coronary heart disease [9,10]. Enzyme treatment and fermentation are the two most common methods used to release bioactive peptides from parent proteins. While enzyme hydrolysis saves time and enhances scalability and predictability of peptides, fermentation (though a relatively slow process) is a cheaper strategy to generate various bioactive peptides with diverse activities. Combining the two methods could enhance the number and kinds of peptides released from the parent protein. A number of lactic acid bacteria such as *Pediococcus pentosaceus* and *Lactobacillus casei* have been used to ferment soy proteins to release antihypertensive peptides of high potency [2,11,12]. In an earlier study, we found that *Lactobacillus rhamnosus* EBD1 isolated from Korean fermented soybean (doenjang) had strong proteolytic activity and could be helpful in generating bioactive peptides.

Therefore, to develop a soybean product with a strong antihypertensive effect, we first hydrolyzed soy protein isolates (SPI) with Prozyme and subsequently fermented the hydrolysate with *Lactobacillus rhamnosus* EBD1 to obtain a product we named P-SPI. This combined strategy enhanced the degree of hydrolysis of the proteins. The long-term effect of P-SPI consumption on the systolic blood pressure of spontaneous hypertensive rats (SHR) was studied.

#### **2. Results**

#### *2.1. The Extent of Hydrolysis*

*L. rhamnosus* EBD1 was able to grow on SPI as a sole nitrogen source and both Prozyme and *L. rhamnosus* digested SPI to various extents. Analysis of hydrolysates by RP-HPLC showed that approximately 15% of the substrate was hydrolyzed by Prozyme after treatment for 1 h. However, subsequent fermentation of the hydrolysate with *L. rhamnosus* EBD1 for 48 h resulted in hydrolysis of approximately 55% of the initial SPI concentration (Figure 1).

**Figure 1.** RP-HPLC chromatograms of various stages of sample preparation: (**A**) represents the chromatogram of raw soy protein isolates (SPI), (**B**) represents the Prozyme treated SPI and (**C**) represents the product (P-SPI).

#### *2.2. ACE-Inhibitory Ability of Hydrolysates and Fermentates*

Both the enzyme hydrolyzed and fermented hydrolysates inhibited ACE to various extents (Table 1). As shown, fermentation of the enzyme hydrolyzed samples improved the ACE inhibitory activity from 60.8% to 88.24%.


**Table 1.** Angiotensin-converting enzyme (ACE) inhibitory activity of processed SPI

Values are means ± SD of three replicates (*n* = 3). n.d.: Not determined.

#### *2.3. ACE Inhibitory Peptides from P-SPI*

All the peptides identified in the peptide profile are displayed in Supplementary Table S1. Since 3008 peptides were identified and could not be individually synthesized/eluted and tested in vitro for ACE inhibitory activity, the peptide sequences were screened using an in silico platform (http://crdd.osdd.net/raghava/ahtpin/index.php) developed by Kumar et al. [13] to predict potential ACE inhibitory peptides. Although many potential ACE inhibitory peptides were identified, peptides IAKKLVLP, PDIGGFGC, PPNNNPASPSFSSSS, GPKALPII and IIRCTGC were most abundant. The peptides were synthesized and their ACE inhibitory activities were confirmed in vitro. Among the peptides tested, IIRCTGC showed the strongest inhibitory activity of 83 ± 0.9%, followed by PPNNNPASPSFSSSS and GPKALPII, which were not significantly different in their inhibitory abilities (*p* > 0.05) (Figure 2). Meanwhile, peptide PDIGGFGC showed an inhibitory activity of 18 ± 7%, while IAKKLVLP displayed the least inhibitory activity of 10 ± 3%. Captopril (the positive control) showed the strongest inhibitory activity of 94 ± 4%.

**Figure 2.** The ACE inhibitory ability of selected peptides from P-SPI. Bars represent the means of three replicates (*n* = 3) ± SD, \**p* < 0.05.

#### *2.4. The Effects of Gastrointestinal Enzymes on ACE Inhibitory Activity of P-SPI*

When P-SPI was subjected to pepsin digestion, its ACE inhibitory activity was not significantly altered. Also, subsequent treatment of the peptides with pancreatin did not affect the ACE inhibitory ability (*p* > 0.05) as shown in Figure 3.

**Figure 3.** The effects of gastrointestinal enzymes on ACE inhibitory activity. Bars represent means of three replicates (*n* = 3) ± SD, <sup>a</sup> *p* < 0.05.

#### *2.5. Blood Pressure Reducing Effects of P-SPI*

The mean SBP measured for all the SHRs from the four experimental groups prior to treatment (zero time) was 179 ± 5.6 mm Hg (*n* = 20). Oral administration of P-SPI at a dose of 100 mg/kg resulted in a significant reduction in SBP when compared to the lack of SBP reduction by physiological saline. The decrease in SBP was maximal at the 4th week of oral administration (−19 ± 4 mm Hg). By contrast, oral administration of 10 mg/kg P-SPI induced a slight decrease in SBP, which was maximal at the 4th week post-administration (−11.2 ± 2 mm Hg). However, all the P-SPI doses reduced SBP significantly when compared to the negative control group and maintained lower blood pressures throughout the feeding period. Administration of captopril at 50 mg/kg resulted in an SBP reduction of 22 mm Hg (Figure 4).

**Figure 4.** Systolic blood pressure changes from the baseline are expressed in absolute values (mmHg) and data are mean ± SEM from 5 determinations. Data points with the same letter are not significantly different (*p* > 0.05) using one-way ANOVA followed by Duncan tests. NC: Negative control, PC: Positive control, G1: Group 1, G2: Group 2.

#### *2.6. Effects of P-SPI Consumption on Body Weight Gain*

P-SPI (10 mg/kg and 100 mg/kg) and captopril administration significantly reduced weight gain from the 2nd week to the 4th week of feeding (*p* < 0.05) relative to SHRs that were given physiological saline. The weights of the rats in these groups were however maintained after the 4th week up to the 6th week of feeding (Figure 5).

**Figure 5.** Time course of weight changes (Δ weight) after oral administration of physiological saline, 50 mg/kg, and 100 mg/kg body weight of P-SPI. Each data point represents mean ± SEM from 5 determinations, and data points with different letters are significantly different (*p* < 0.05) using one-way ANOVA followed by Duncan tests. NC: Negative control, PC: Positive control, G1: Group 1, G2: Group 2.

#### *2.7. Effects of P-SPI Consumption on Feed Intake*

Generally, neither the administration of P-SPI nor captopril affected the quantity of feed intake of SHRs (Figure 6). The quantity of feed consumed by the rats in the test and control groups was not significantly different throughout the course of study (*p* > 0.05).

**Figure 6.** Each data point represents mean ± SEM from 5 determinations, and data points with different letters are significantly different (*p* < 0.05) using one-way ANOVA followed by Duncan tests. NC: Negative control, PC: Positive control, G1: Group 1, G2: Group 2.

#### **3. Discussion**

To generate bioactive peptides from soy proteins, proteolytic enzymes or microorganisms would be required to release the peptides from the parent proteins. The use of fermented soybean or enzyme hydrolyzed soy proteins for reducing high blood pressure is well recognized. However, very few studies (if any) have exploited the combined effects of proteolytic enzymes and fermentation for developing antihypertensive foods.

#### *3.1. Hydrolysis and Fermentation of Soy Proteins*

Hydrolysis with proteases followed by fermentation reduces the time needed for efficient substrate hydrolysis compared with fermentation alone. Also, this strategy allows the generation of new antihypertensive peptide sequences that would not have been generated if only a single method was applied. The combined processing method in this work enhanced hydrolysis of the soy protein compared to the enzyme treatment alone. *Lactobacillus rhamnosus* is known for its well-developed protein degradation machinery, with which it hydrolyzes proteins. Peptides generated by the cell-envelope proteinase hydrolysis are transported into the bacterial cell for further hydrolysis by peptidases so as to meet its nitrogen requirements [14,15]. The cell number increased from 2 × 10<sup>8</sup> cells to 10<sup>9</sup> cells after 48 h of incubation.

#### *3.2. Effects of P-SPI on ACE Activity*

The product obtained (P-SPI) displayed strong ACE inhibitory ability of 88.24 ± 3.2% (IC<sub>50</sub> = 0.592 mg/mL) compared to raw SPI. Using LC-ESI-TOF-MS/MS, we identified 3008 peptides in P-SPI samples which were generated by the processing method. In silico screening of peptides using the AHTpin software available at http://crdd.osdd.net/raghava/ahtpin/index.php revealed many potential ACE inhibitory peptides, among which IAKKLVLP, PDIGGFGC, PPNNNPASPSFSSSS, GPKALPII, and IIRCTGC were most abundant. Since these five peptides satisfied some of the common structural features described for many food-derived ACE inhibitory peptides [16–18], they were synthesized and their inhibitory activities were tested in vitro. As seen in Figure 2, only PPNNNPASPSFSSSS, GPKALPII, and IIRCTGC were strong ACE inhibitors, while IAKKLVLP and PDIGGFGC were weak inhibitors. Earlier studies of structure–activity relationships between peptides and ACE inhibition indicated that peptides whose C-terminal tripeptides are hydrophobic show a stronger binding ability to ACE [16], and this could account for why GPKALPII showed good ACE inhibition. Also, peptides with branched-chain aliphatic amino acids or a hydrophobic amino acid at the N-terminal have been shown to be good competitive inhibitors of ACE. These criteria make IIRCTGC and PPNNNPASPSFSSSS good ACE inhibitory candidates [17,18].
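The two sequence criteria mentioned above can be expressed as simple checks on the peptide termini. The sketch below is not the AHTpin algorithm and not the authors' screening method. The function names are hypothetical, and the residue sets are an assumption, since hydrophobicity classifications differ between scales (proline, for example, is counted as hydrophobic here).

```python
# Assumed residue classes (one-letter codes); other hydrophobicity scales exist.
HYDROPHOBIC = set("AVLIPFWM")
BRANCHED_ALIPHATIC = set("VLI")


def c_terminal_tripeptide_hydrophobic(peptide):
    """True if the last three residues are all in the assumed hydrophobic set."""
    return len(peptide) >= 3 and all(r in HYDROPHOBIC for r in peptide[-3:])


def n_terminal_favourable(peptide):
    """True if the first residue is branched-chain aliphatic or hydrophobic."""
    return peptide[0] in (BRANCHED_ALIPHATIC | HYDROPHOBIC)


# The five most abundant peptides reported for P-SPI.
peptides = ["IAKKLVLP", "PDIGGFGC", "PPNNNPASPSFSSSS", "GPKALPII", "IIRCTGC"]

# Map each peptide to (C-terminal criterion, N-terminal criterion).
features = {
    p: (c_terminal_tripeptide_hydrophobic(p), n_terminal_favourable(p))
    for p in peptides
}
```

Under these assumed residue sets, GPKALPII meets the C-terminal criterion, while IIRCTGC and PPNNNPASPSFSSSS meet the N-terminal one. IAKKLVLP meets both yet was a weak inhibitor in vitro, which illustrates that such rules only flag candidates and do not replace the experimental assay.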

#### *3.3. Effect of Gastrointestinal Enzymes on P-SPI Activity*

In the gut, ingested peptides encounter gastrointestinal enzymes and may be hydrolyzed. This may either result in loss of activity or generation of other potent peptides. Treatment of P-SPI with gastrointestinal enzymes in vitro, however, did not significantly affect ACE inhibitory activity (*p* > 0.05), indicating that the ACE inhibitory peptides were either resistant to gastrointestinal enzyme digestion or retained their activity even after digestion.

#### *3.4. The Effect of P-SPI Consumption on Systolic Blood Pressure*

Recent studies have indicated that systolic blood pressure is a better factor for predicting cardiovascular disease than diastolic blood pressure [19,20]; hence, reducing SBP reduces the risk of CVD. Results from this study showed a clear reduction in SBP when SHR were fed with P-SPI (10 mg/kg BW and 100 mg/kg BW) for six weeks. Nevertheless, the effect of P-SPI was less pronounced than the effect of captopril (a standard antihypertensive drug). However, this study was aimed at developing a functional food that could prevent or reduce high blood pressure rather than curing the condition. Compared to synthetic antihypertensive drugs, food-derived antihypertensive peptides have been reported to have no side effects, have higher tissue affinities, and may be more slowly cleared from tissues [21].

#### *3.5. The Effect of P-SPI Consumption on Feed Satiety and Body Weight Gain*

Many studies have found a strong association between obesity and hypertension in humans [22–24], because an increase in body weight seems to be followed by an increase in blood pressure. However, whether obesity precedes hypertension or hypertension leads to obesity remains unclear. Given the association between these two conditions, we studied the effect of P-SPI consumption on SHR body weight. Relative to the untreated group, P-SPI significantly reduced SHR body weight gain from the second week to the fourth week of feeding. From the fourth to the sixth week, however, SHR body weight was maintained (*p* > 0.05), even though the rats were continuously fed with P-SPI. A similar observation was made among SHRs administered captopril.

Some studies have shown that certain bioactive peptides decrease appetite and lead to reduced food intake, resulting in reduced weight gain [25–27]. For this reason, we checked whether P-SPI consumption affected feed intake relative to control groups. It was observed that P-SPI consumption did not have any significant effect on the quantity of feed consumed by the rats (Figure 6). It is therefore possible that the reduction in weight gain might have been caused by other reasons apart from increased satiety.

The relationship between long-term soy protein hydrolysate consumption and blood pressure has been discussed in many previous reports. For instance, Yang et al. [28] reported that pepsin-hydrolyzed soy peptides reduced SBP (up to −35 mmHg) after SHR were fed for twelve weeks. Rhyu [29] also observed an SBP reduction of about −13 mmHg when SHR were fed with fermented soybean paste for seven weeks. These studies, however, used low molecular weight peptides mixed with some other foods. We believe our study is a better representation of how food could be processed and directly consumed as a functional food for reducing high blood pressure. Our results are similar to those of Wu et al. [30], who recorded an SBP reduction of about −20 mmHg when SHRs were fed with 100 mg/kg BW of soy proteins hydrolyzed with Alcalase. The different enzymes used for SPI hydrolysis nevertheless yield different peptides with different potencies for ACE inhibition, so different treatments would differ in their ability to lower blood pressure. Meanwhile, any small reduction in high blood pressure could beneficially reduce the risk of cardiovascular diseases [31].

In conclusion, our data demonstrate that Prozyme hydrolysis followed by *Lactobacillus rhamnosus* EBD1 fermentation enhanced bioactive peptide generation and improved ACE inhibition. Consumption of P-SPI could therefore be helpful in reducing high blood pressure in humans. Meanwhile, studies of the mechanism(s) by which P-SPI reduces blood pressure are warranted.

#### **4. Materials and Methods**

#### *4.1. Chemicals and Cultures*

Soybean protein isolates (Pro-Fam®) were obtained from Archer Daniels Midland Company (ADM, Decatur, Illinois, USA). Hip-His-Leu, ACE (from rabbit lung), Pepsin (from porcine gastric mucosa), and Pancreatin (from porcine pancreas) were obtained from Sigma-Aldrich (Yongin, Korea). Prozyme 2000P was obtained from Bison Corporation, Gyunggi-Do, Korea. *Lactobacillus rhamnosus* EBD1 was obtained from the Department of Food Science and Biotechnology (Chuncheon-si, Gangwon-do, Korea) and used for this study because it showed strong proteolytic ability in our earlier study [32]. The bacterial stock culture was maintained at −80 °C in de Man, Rogosa, and Sharpe (MRS) broth (Difco, Hongcheon, Korea), containing 20% glycerol (*v*/*v*). The culture was streaked on MRS agar and cultured at 37 °C for 24 h. A single colony was then transferred into MRS broth at 37 °C and harvested at the exponential phase of growth.

#### *4.2. Preparation of Protein Hydrolysates and Fermentation*

SPI was hydrolyzed with Prozyme according to the enzyme manufacturer's instructions. Briefly, a 20% (*w*/*v*) suspension of SPI in distilled water was prepared, and the pH was adjusted to 7. The protein was digested with 3% Prozyme at 55 °C for 1 h. The sample was then autoclaved at 121 °C to stop the enzyme activity and to sterilize the sample. A starter culture of *Lactobacillus rhamnosus* EBD1 ($2 \times 10^{8}$ cfu/mL) was inoculated into a 500 mL Erlenmeyer flask containing the hydrolyzed SPI. Cultivation was carried out at 37 °C with agitation at 150 rpm. After 48 h of incubation, the fermented sample (P-SPI) was freeze-dried with a TFD5505 table top freeze dryer (ilshinBioBase Co. Ltd., Dongducheon-si, Korea) and stored at −20 °C for further analysis.

#### *4.3. Determination of Extent of Proteolysis*

The degree of proteolysis of the SPI samples (raw SPI and P-SPI) was analyzed as reported earlier [33] with slight modifications. Briefly, SPI hydrolysates were analyzed by reversed-phase high-performance liquid chromatography (RP-HPLC) using a Waters system (Waters Corporation, Milford, MA, USA) equipped with a 1525 Binary HPLC pump, a 2996 Photodiode Array Detector, and a 717 plus Autosampler. An aliquot (90 μL, 10 mg/mL) of the sample was applied to a Symmetry® C18 5 μm, 4.6 × 150 mm column (Waters, Milford, MA, USA). The column was developed at a flow rate of 1 mL/min at 40 °C. Elution was performed with a linear gradient of solvent B (acetonitrile with 1% TFA) in solvent A (water with 1% TFA) from 0–80% in 60 min. Detection of peptides and proteins was carried out at 214 nm. The extent of proteolysis was calculated by expressing the chromatographic peak areas of either the enzyme-treated or the enzyme-treated and fermented SPI hydrolysates as a percentage of that of raw SPI.

The degree of hydrolysis after enzyme treatment was calculated as:

$$\text{Degree of hydrolysis} = \frac{100\% \times (\text{Peak area of A} - \text{Peak area of B})}{(\text{Peak area of A})} \tag{1}$$

The degree of hydrolysis after enzyme treatment and fermentation was calculated as:

$$\text{Degree of hydrolysis} = \frac{100\% \times (\text{Peak area of A} - \text{Peak area of C})}{(\text{Peak area of A})} \tag{2}$$

where A represents the chromatogram of raw SPI, B represents the Prozyme treated SPI, and C represents P-SPI.
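Both equations share the same form, so the calculation can be sketched as a single helper. The following is a minimal illustration only: the function name and the peak areas are hypothetical and are not taken from the study's chromatograms.

```python
def degree_of_hydrolysis(raw_area: float, treated_area: float) -> float:
    """Percent reduction in chromatographic peak area relative to raw SPI (A)."""
    if raw_area <= 0:
        raise ValueError("raw_area must be positive")
    return 100.0 * (raw_area - treated_area) / raw_area


# Hypothetical peak areas (arbitrary units), for illustration only.
area_raw_spi = 1000.0   # A: raw SPI
area_prozyme = 400.0    # B: Prozyme-treated SPI
area_p_spi = 150.0      # C: Prozyme-treated and fermented SPI (P-SPI)

dh_enzyme = degree_of_hydrolysis(area_raw_spi, area_prozyme)    # Equation (1)
dh_fermented = degree_of_hydrolysis(area_raw_spi, area_p_spi)   # Equation (2)
print(f"Enzyme only: {dh_enzyme:.1f}%, enzyme + fermentation: {dh_fermented:.1f}%")
```

With these made-up areas the helper returns 60% and 85%, which shows how the same raw SPI area serves as the reference in both cases.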

#### *4.4. In-Vitro Assay for ACE Inhibitory Activity*

ACE inhibitory activity was determined by the procedure described by Cushman and Cheung [34]. Briefly, 20 μL of ACE inhibitor solution was incubated with 50 μL of 5 mM HHL in 100 mM sodium borate buffer (pH 8.3) containing 0.3 M NaCl at 37 °C for 5 min. To initiate the reaction, 10 μL of 0.1 U/mL ACE solution was added, and the mixture was incubated at 37 °C for 30 min. The reaction was terminated by adding 100 μL of 1 M HCl, and the reaction mixture was mixed with 1 mL ethyl acetate. The mixture was vortexed for 60 s and centrifuged at 2000× *g* for 5 min. The ethyl acetate layer (0.8 mL) was transferred to a 1.5 mL Eppendorf tube and evaporated in a water bath. The hippuric acid (HA) in the tube was dissolved in distilled water (0.8 mL). The amount of HA formed was measured at 228 nm using a biospectrometer (Eppendorf Biospectrometer® fluorescence, Eppendorf Korea Ltd., Korea). The amount of HA liberated from Hip-His-Leu (HHL) under these reaction conditions without an inhibitor was used as a control. The extent of inhibition was calculated as

$$\text{ACE inhibition} = 100\text{ }\% \times [(\text{B} - \text{A})/\text{B}]$$

where A is the optical density in the presence of both ACE and the ACE inhibitory component, and B is the optical density with ACE but without the ACE inhibitory component.

For the determination of IC50, a series of dilutions containing 5000 μg/mL, 500 μg/mL, 50 μg/mL, 5 μg/mL, 0.5 μg/mL, and 0.05 μg/mL of P-SPI samples was prepared. The concentration of peptides required to inhibit 50% of ACE activity was calculated from the regression curves obtained for each fraction.
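The inhibition formula and the IC50 estimate can be sketched in a few lines. The text does not state which regression model was fitted, so the example below assumes a straight-line fit of percent inhibition against log10 concentration; the function names and the inhibition values are hypothetical, and only the six concentrations come from the dilution series above.

```python
import math


def ace_inhibition(od_with_inhibitor: float, od_control: float) -> float:
    """Percent ACE inhibition: 100 x (B - A) / B, with A = sample and B = control."""
    return 100.0 * (od_control - od_with_inhibitor) / od_control


def ic50_from_log_linear_fit(concentrations, inhibitions) -> float:
    """Least-squares line through (log10 concentration, % inhibition), solved for 50%.

    Assumption: the dose-response curve is roughly linear on a log scale
    over the tested range; the original regression model is not specified.
    """
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(inhibitions) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, inhibitions))
    if sxx == 0 or sxy == 0:
        raise ValueError("cannot fit a sloped line through these points")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 10 ** ((50.0 - intercept) / slope)


# Dilution series from the text (ug/mL) with hypothetical inhibition values (%).
concentrations = [0.05, 0.5, 5.0, 50.0, 500.0, 5000.0]
inhibitions = [5.0, 20.0, 35.0, 50.0, 65.0, 80.0]

ic50 = ic50_from_log_linear_fit(concentrations, inhibitions)
print(f"Estimated IC50: {ic50:.1f} ug/mL")
```

A sigmoidal (four-parameter logistic) fit is a common alternative when the data flatten at the extremes; the linear form is used here only to keep the sketch short.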

#### *4.5. Identification of Peptides by Mass Spectrometry*

Liquid chromatography-electrospray ionization-quantitative time-of-flight tandem mass spectrometry experiments (LC-ESI-TOF-MS/MS) were carried out at the National Instrumentation Center for Environmental Management of Seoul National University in Korea, according to an earlier method [35]. Analysis was done using high-performance liquid chromatography (UltiMate 3000 Series system, DIONEX Technologies, Sunnyvale, CA, USA), an integrated system comprising an auto-switching nano pump, an autosampler (Tempo™ nano LC system; MDS SCIEX, Seoul, Korea), and a hybrid quadrupole-time-of-flight (TOF) mass spectrometer (QStar Elite; Applied Biosystems, USA) fitted with a fused silica emitter tip (New Objective, Woburn, MA, USA). To ionize the samples, nano-electrospray ionization was used. P-SPI (1.5 g) was dissolved in 50 mL of double distilled water. Fractions (1.5 μL) of the sample were injected into the LC-nano ESI-MS/MS system. The sample was trapped on a ZORBAX 300SB-C18 trap column (300-μm i.d. × 5 mm, 5-μm particle size, 100 pore size, Agilent Technologies, Santa Clara, California, USA, part number 5065-9913) and washed for 6 min with 98% solvent A [water/acetonitrile (98:2, *v*/*v*), 0.1% formic acid] and 2% solvent B [water/acetonitrile (2:98, *v*/*v*), 0.1% formic acid] at a flow rate of 5 μL/min. The peptides were separated on a ZORBAX 300SB-C18 capillary column (75-μm i.d. × 150 mm, 3.5-μm particle size, 100 pore size, part number 5065-9911) at a flow rate of 300 nL/min with a gradient of 2%–35% solvent B over 30 min, then 35%–90% over 10 min, followed by 90% solvent B for 5 min, and finally 5% solvent B for 15 min. Electrospray was performed at an ion spray voltage of 2000 eV through a coated silica tip (FS360-20-10-N20-C12, PicoTip emitter, New Objective). The peptides were analyzed automatically using Analyst QS 2.0 software (Applied Biosystems, Seoul, Korea). The range of m/z values was 200–2000. Peptides of interest were ordered at >95% purity from GL Biochem (Shanghai) Ltd. (Shanghai, China).
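To relate the 200–2000 m/z acquisition window to a given peptide, the theoretical m/z of its protonated ions can be estimated from standard monoisotopic residue masses. The sketch below is illustrative only: it assumes unmodified peptides with free termini, the helper names are hypothetical, and it is not the search procedure used by the Analyst software.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free N- and C-termini
PROTON = 1.00728   # mass of the charge carrier in positive-ion mode


def monoisotopic_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def mz(sequence: str, charge: int) -> float:
    """Theoretical m/z of the [M + zH]z+ ion."""
    return (monoisotopic_mass(sequence) + charge * PROTON) / charge


def charge_states_in_window(sequence, low=200.0, high=2000.0, max_charge=4):
    """Charge states whose m/z falls inside the acquisition window."""
    return [z for z in range(1, max_charge + 1) if low <= mz(sequence, z) <= high]


for peptide in ("IAKKLVLP", "PDIGGFGC", "PPNNNPASPSFSSSS", "GPKALPII", "IIRCTGC"):
    states = charge_states_in_window(peptide)
    print(peptide, round(monoisotopic_mass(peptide), 3), states)
```

The loop covers the five synthesized peptides named in the discussion; which charge states are actually observed depends on the instrument and sample, so the listed states are only those permitted by the window.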

#### *4.6. Effects of Gastrointestinal Enzymes on P-SPI (In Vitro)*

A two-stage simulated gastrointestinal digestion was carried out on P-SPI similar to an earlier report [2]. Pepsin (0.2 mg) was added to 10 mL of 1 mg/mL P-SPI solutions and adjusted to pH 2.0 using 1 M HCl. The samples were incubated at 37 °C. After 120 min, the pH was raised to 7.5 by adding 1 M NaOH. Pancreatin (0.2 mg) was added and the samples were further incubated at 37 °C for 180 min. The reaction was stopped by heating at 80 °C for 10 min in a water bath, followed by cooling at room temperature. The samples were analyzed for their ACE inhibitory abilities.

#### *4.7. Long-Term Effect of P-SPI Consumption on SHR Blood Pressure*

All animal experimental procedures were in accordance with the ethical procedures and scientific care by Kangwon National University-Institutional Animal Care and Use Committee (approval no. KW-151127-1, 13 August 2018). Twenty male SHRs weighing 250–300 g were used (Charles River Laboratories, Barcelona, Spain). The rats were divided randomly into four groups. Rats were housed in temperature-controlled rooms (23 °C) with 12 h light/dark cycles and consumed tap water and standard diets ad libitum. Experimental procedures were conducted in accordance with the Kangwon National University animal ethics committee guidelines. Indirect measurement of systolic blood pressure (SBP) in awake restrained rats was carried out by the non-invasive tail-cuff method using computer-assisted non-invasive blood pressure equipment (NIBP 76-0173 unit with LE5160R cuff & transducer, Sang Chung Commercial Co., Ltd., Kangnam-Ku, Korea). The rats were kept at 37 °C for 15 min to make the pulsations of the tail artery detectable. By gastric intubation, each group of rats was administered 10 mg of P-SPI per kg body weight (BW), 100 mg of P-SPI per kg BW, captopril (50 mg/kg BW), or 750 μL physiological saline once daily. SBP was measured before peptide intake (week 0) and at the 2nd, 4th and 6th weeks after intake. Each value of SBP was obtained by averaging five successful measurements without disturbance of the signal. Changes in SBP were calculated as the absolute difference (in mmHg) with respect to the basal values of measurements obtained just before starting the treatments.
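The SBP bookkeeping described above (five readings averaged per session, then differenced against week 0) can be sketched as follows. The readings are hypothetical, the function names are illustrative, and the sign convention (negative values indicate a fall in SBP) is an assumption consistent with the negative changes quoted in the discussion.

```python
from statistics import mean


def session_sbp(readings):
    """Average of the five successful tail-cuff readings for one session (mmHg)."""
    if len(readings) != 5:
        raise ValueError("expected exactly five successful measurements")
    return mean(readings)


def sbp_changes(weekly_readings):
    """Change in SBP (mmHg) at each later week relative to the week-0 baseline."""
    baseline = session_sbp(weekly_readings[0])
    return {
        week: session_sbp(readings) - baseline
        for week, readings in sorted(weekly_readings.items())
        if week != 0
    }


# Hypothetical readings for a single rat (mmHg), for illustration only.
rat_readings = {
    0: [200.0, 202.0, 198.0, 201.0, 199.0],
    2: [190.0, 191.0, 189.0, 190.0, 190.0],
    4: [185.0, 186.0, 184.0, 185.0, 185.0],
    6: [180.0, 181.0, 179.0, 180.0, 180.0],
}

changes = sbp_changes(rat_readings)
print(changes)
```

Group-level values would then be the mean and standard deviation of these per-rat changes within each treatment group.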

#### *4.8. Effect of P-SPI Consumption on Feed Consumption and Weight Gain*

To assess the amount of feed consumed by the rats, the mass of feed supplied to the rats each morning was recorded, and the remaining feed in the feeding trough was weighed the next morning. The difference in mass was recorded as the amount of feed consumed by the rats.

For weight gain assessment, the weight of each rat in each group was recorded once every week (from week 0–6) and the changes in body weight were noted as weight gain or loss.

#### *4.9. Statistical Analysis*

Baseline systolic blood pressure was defined as the mean of the values measured in the first run-in period. Blood pressures, weight, and the amount of feed consumed were presented as the mean value ± standard deviations (SD) for all SHR in each group. The outcomes for each week between groups were analyzed with one-way ANOVA followed by Duncan tests. Differences were considered significant when *p* < 0.05. All statistical analysis was done using GraphPad Prism version 5.01 (GraphPad Software, Inc, La Jolla, CA, USA).
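For readers who want to see what the one-way ANOVA step computes, the F statistic can be written out directly. This is a sketch under stated assumptions: the study used GraphPad Prism, the data below are hypothetical, equal group sizes of five rats are assumed from the twenty animals and four groups, and neither the p-value nor the Duncan post hoc test is implemented here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical week-6 SBP changes (mmHg) for four groups of five rats each.
saline = [-1.0, 2.0, 0.0, 1.0, -2.0]
p_spi_10 = [-9.0, -11.0, -10.0, -12.0, -8.0]
p_spi_100 = [-19.0, -21.0, -20.0, -22.0, -18.0]
captopril = [-34.0, -36.0, -35.0, -37.0, -33.0]

f_stat, df_b, df_w = one_way_anova_f([saline, p_spi_10, p_spi_100, captopril])
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

The F value is then compared against the F distribution with the reported degrees of freedom to obtain the p-value that is tested against the 0.05 threshold.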

**Supplementary Materials:** Supplementary materials can be found at http://www.mdpi.com/1422-0067/20/6/1496/s1. Table S1: LC-ESI-TOF-MS/MS analysis of P-SPI-derived peptides.

**Author Contributions:** Conceptualization: D.-H.O. and M.H.P.; methodology: E.B.-M.D., F.K.O., R.C. and J.-H.K.; writing, reviewing and editing: E.B.-M.D. and D.-H.O.; all authors gave their feedback, edited and approved the final manuscript.

**Funding:** This work was funded by the Korean Ministry of Small and Medium scale Enterprises and Startups under the "Regional Specialized Industry Development Program (R&D, R0006438)" supervised by the Korea Institute for Advancement of Technology (KIAT). Grant number C0502529.

**Acknowledgments:** We thank the central laboratory of Kangwon National University for their assistance in LC-MS analysis.

**Conflicts of Interest:** The authors declare that they have no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Insect Cecropins, Antimicrobial Peptides with Potential Therapeutic Applications**

**Daniel Brady 1, Alessandro Grapputo 1, Ottavia Romoli 1,2 and Federica Sandrelli 1,\***


Received: 31 October 2019; Accepted: 20 November 2019; Published: 22 November 2019

**Abstract:** The alarming escalation of infectious diseases resistant to conventional antibiotics requires urgent global actions, including the development of new therapeutics. Antimicrobial peptides (AMPs) represent potential alternatives in the treatment of multi-drug resistant (MDR) infections. Here, we focus on Cecropins (Cecs), a group of naturally occurring AMPs in insects, and on synthetic Cec-analogs. We describe their action mechanisms and antimicrobial activity against MDR bacteria and other pathogens. We report several data suggesting that Cec and Cec-analog peptides are promising antibacterial therapeutic candidates, including their low toxicity against mammalian cells, and anti-inflammatory activity. We highlight limitations linked to the use of peptides as therapeutics and discuss methods overcoming these constraints, particularly regarding the introduction of nanotechnologies. New formulations based on natural Cecs would allow the development of drugs active against Gram-negative bacteria, and those based on Cec-analogs would give rise to therapeutics effective against both Gram-positive and Gram-negative pathogens. Cecs and Cec-analogs might be also employed to coat biomaterials for medical devices as an approach to prevent biomaterial-associated infections. The cost of large-scale production is discussed in comparison with the economic and social burden resulting from the progressive diffusion of MDR infectious diseases.

**Keywords:** antimicrobial peptides; insects; Cecropins; Cec-analogs; MDR infectious diseases

#### **1. Introduction**

The spread of infectious diseases resistant to conventional treatments has become an alarming phenomenon worldwide, prompting the United Nations and international agencies to call for immediate and coordinated actions to avoid a possible global drug-resistance crisis [1]. Drug-resistance phenomena involve not only antibacterial compounds, but also antiviral, antifungal, and antiprotozoal therapeutics in all countries, independent of their economic level. Currently, estimates indicate that drug-resistance cases result in 700,000 deaths per year worldwide, and without direct action, annual death tolls could reach 10 million by 2050 [1]. Research and development of new therapeutics have been included at the forefront of the proposed actions to tackle the global antimicrobial resistance phenomenon [1]. Several lines of evidence indicate that the utilization of antimicrobial peptides (AMPs) represents a compelling option [2,3].

AMPs are naturally occurring peptides produced as a first line of defense against pathogenic infections by virtually all living species, from bacteria to mammals [2]. AMPs play an essential role in those organisms that lack an adaptive immune system and base their defense only on the innate immune response, such as invertebrates. Of these, Insecta is the largest animal class on Earth, containing 50% of all known animal species, and represents a wide source of AMPs. To date, 305 out of the 3087 AMPs listed in the Antimicrobial Peptide Database (APD; Available online: http://aps.unmc.edu/AP [4]) are derived from insects. Notwithstanding, these numbers are likely to increase extensively given the current growth of accessible genomic, transcriptomic, and proteomic insect datasets, which will accelerate the identification of new putative AMPs available for subsequent analyses and characterization.

First identified about 40 years ago, a wide variety of insect AMPs has since been characterized. These molecules have been intensively studied, not only for their physiological role in insect immunity, but also as potential alternatives to conventional antibiotics in the treatment of infectious diseases [5–7]. Moreover, some insect AMPs have been shown to possess immunomodulatory functions as well as anticancer activity [5,6]. These biological properties, combined with modern advances in biotechnology, have resulted in a renewed interest in insect AMPs and their potential to combat modern biomedical challenges.

Insect AMPs can be classified on the basis of their sequence and structure into three groups: (i) α-helical peptides, lacking in cysteine residues (e.g., Cecropins (Cecs) and Moricins); (ii) β-sheet cysteine-rich peptides (e.g., Defensins and Drosomycins); and (iii) linear-extended peptides, often characterized by high proportions of peculiar amino acids (aa) such as proline, arginine, tryptophan, glycine, and histidine. Both proline-rich peptides (e.g., Apidaecins, Drosocins, and Lebocins) and glycine-rich AMPs (e.g., Attacins and Gloverins) belong to this group. As the different classes of insect AMPs have been recently reviewed in [5–7], here we focus on Cecs, one of the largest groups of insect AMPs. We report a comprehensive overview of the Cec family in insects, and provide up-to-date models explaining their mode of action. We then highlight the antimicrobial, anti-inflammatory, and antitumor activities of natural Cecs and Cec-like peptides, as well as of synthetic Cec-analogs, which carry different types of sequence modifications. The potential benefits and limitations in the development of Cec-based antibacterial therapeutics are also presented.

#### **2. The Family of Cecropins in Insects**

Cecs and other Cec-like peptides, including Sarcotoxins, Stomoxins, Papiliocin, Enbocins, and Spodopsins, form the most abundant family of linear α-helical AMPs in insects (Table 1). Cec AMPs were first isolated from the hemolymph (insect blood) of the lepidopteran *Hyalophora cecropia* and were characterized for their antimicrobial activity against several Gram-positive and Gram-negative bacteria [8–10]. Subsequently, these peptides have been identified in two other orders of Hexapoda, Coleoptera and Diptera, as well as in other species of Lepidoptera [7,11]. Searches of several genomes did not identify Cec or Cec-like peptide sequences in other insect orders ([11]; this review), including Hymenoptera, which is considered the sister clade of the other holometabolous insects [12]. However, Cecs have been identified in other animals, such as Styelin in tunicates [13], and Cec P1, first isolated from pigs [14], but then found to belong to the nematode *Ascaris suum* [15]. Cec-like peptides have also been identified in the bacterium *Helicobacter pylori* [16]. Since these peptides derive from the N-terminal part of ribosomal protein L1 (RpL1) and are similar to Cecs from *H. cecropia*, Pütsep and colleagues suggested that Cecs may have evolved from an early prokaryote RpL1 gene [16]. The homology of Cecs has been debated, and some authors consider them a single family that, together with Dermaseptin (amphibians), Ceratotoxin (insects), and Pleurocidin (fish), forms a Cec superfamily [17]. Indeed, the members of the Cec family show sequence similarity that enabled the identification of a first sequence signature of Cecs from some species of Brachycera (Diptera) and Lepidoptera (i.e., [KR]-[KRE]-[LI]-[ED]-[RKGH]-[IVMA]-[GV]-[QRK]-[NHQR]-[IVT]-[RK]-[DN]-[GAS]-[LIVSAT][LIVE]-[RKQS]-[ATGV]-[GALIV]-[PAG]) [17]. This has been updated to include Culicomorpha (mosquitoes) and nematodes (i.e., [KRDEN]-[KRED]-[LIVMR]-[ED]-[RKGHN]-X(0,1)[IVMALT]-[GVIK]-[QRKHA]-[NHQRK]-[IVTA][RKFAS]-[DNQKE]-[GASV]-[LIVSATG][LIVEAQKG]-[RKQSGIL]-[ATGVSFIY]-[GALIVQN]) [18]. However, the lower similarity between Cecs from insects and other organisms and the lack of Cec-like peptides outside the clade of Coleoptera, Diptera, and Lepidoptera prompted other authors to suggest that insect *Cec* genes may have evolved just once in the common ancestor of these holometabolous orders, implying that insect and non-insect Cecs are not homologous [11].
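Sequence signatures of this kind translate directly into regular expressions, which makes it easy to scan candidate peptides. The sketch below is a minimal illustration: the two signature strings are copied from the text as printed, the helper names are hypothetical, and the query string is a toy fragment constructed to satisfy the signatures rather than a database entry.

```python
import re

# Signatures as printed in the text (refs. [17] and [18] respectively).
SIGNATURE_17 = (
    "[KR]-[KRE]-[LI]-[ED]-[RKGH]-[IVMA]-[GV]-[QRK]-[NHQR]-[IVT]-"
    "[RK]-[DN]-[GAS]-[LIVSAT][LIVE]-[RKQS]-[ATGV]-[GALIV]-[PAG]"
)
SIGNATURE_18 = (
    "[KRDEN]-[KRED]-[LIVMR]-[ED]-[RKGHN]-X(0,1)[IVMALT]-[GVIK]-"
    "[QRKHA]-[NHQRK]-[IVTA][RKFAS]-[DNQKE]-[GASV]-[LIVSATG]"
    "[LIVEAQKG]-[RKQSGIL]-[ATGVSFIY]-[GALIVQN]"
)


def signature_to_regex(signature: str) -> "re.Pattern[str]":
    """Convert a PROSITE-style signature into a compiled regular expression."""
    compact = re.sub(r"[\s-]+", "", signature)  # hyphens only separate positions

    def gap(match):
        low, high = match.group(1), match.group(2)
        # X(n) is any n residues; X(m,n) is between m and n residues.
        return f".{{{low}}}" if high is None else f".{{{low},{high}}}"

    return re.compile(re.sub(r"X\((\d+)(?:,(\d+))?\)", gap, compact))


def find_signature(signature: str, sequence: str):
    """Return 1-based (start, end) of the first match, or None if absent."""
    match = signature_to_regex(signature).search(sequence.upper())
    return None if match is None else (match.start() + 1, match.end())


# Toy fragment built to satisfy the signatures (illustrative, not a real entry).
toy_sequence = "AAKKIEKVGQNIRDGIIKAGPAA"
print(find_signature(SIGNATURE_17, toy_sequence))
print(find_signature(SIGNATURE_18, toy_sequence))
print(find_signature(SIGNATURE_17, "GGGGGGGGGGGGGGGGGGGGGGG"))
```

Because the updated signature allows more residues at most positions and an optional gap, it accepts every sequence matched by the first one over the shared positions while also admitting more divergent peptides.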


*Int. J. Mol. Sci.* **2019** , *20*, 5862



Although most Cec diversity is found in insect taxa with whole genome sequences, phylogenetic analysis suggests that there is a significant undiscovered diversity in other holometabolous insects. Within different species, *Cec* genes are generally present in a variable number of copies organized in clusters or dispersed in the genome and can include both functional and non-functional elements (pseudogenes). For example, among Diptera, *Drosophila melanogaster* shows four functional genes (*Cec A1*, *A2*, *B*, and *C*) and two pseudogenes (*Cec* ψ*1* and *Cec* ψ*2*), clustered in a ~7-kb region [68,69]; to date, *Musca domestica* displays the largest gene family, characterized by 12 *Cec* members [70]. Among Lepidoptera, the *H. cecropia Cec* locus spans ~20 kb and contains three *Cec* genes (*A*, *B*, and *D*) [9,71], coding for the three functional peptides Cec A, B, and D. Moreover, *H. cecropia* shows the additional Cec forms C, E, and F, which have been isolated in low amounts and classified as allelic variants or degradation products of the three main A, B, and D forms [9]. In the domesticated silkworm *Bombyx mori*, the *Cec* gene family is composed of at least 14 elements (two *Cec A* (*A1* and *A2*), six *Cec B* (*B1*–*B6*), one *Cec C*, two *Cec D* (*D* and *D2*), one *Cec E*, and two *enbocins* (*enb* 1 and 2)), organized in two clusters, mapping on two different chromosomes [72]. In Coleoptera, functional *Cec* genes have been identified in species like *Acalolepta luxuriosa* (Cec; [20]), *Oxysternon conspicillatum* (Oxysterlins; [19]), and *Paederus dermatitis* (Sarcotoxin Pd; [21]), whereas only non-functional *Cec* pseudogenes have been reported in the coleopteran model *Tribolium castaneum* [73,74].

Phylogenetic analyses and single genome sequencing revealed that insect Cec and Cec-like peptides originated via gene duplication and evolved via a birth and death model of gene evolution [72,75]. The occurrence of gene duplication events is confirmed by the presence of transposable elements in both 5' and 3' flanking regions, and repeated gene duplication within species. Furthermore, tandem gene arrangement within the genome, non-functionalization, and loss of some *Cec* gene copies, and the presence of highly divergent and highly similar gene copies within species all support the gene duplication hypothesis [75,76]. Compared to other AMPs, Cecs show no sites under positive selection [77,78], but frequent duplication events may be adaptive, enabling new gene copies to mutate and acquire novel antimicrobial properties [79].

Phylogenetic analysis (Figure 1) [11,72,76] shows that Cecs from Lepidoptera form a monophyletic group (derived from a single ancestral gene) and evolved independently in this order of insects [22]. In contrast, the phylogenetic relationships of Cecs from Diptera and Coleoptera are more complex. Complementing previous phylogenetic analyses [76,80], we included new data from mosquitos and several Coleoptera species. Cecs from Diptera and Coleoptera are both paraphyletic, suggesting that Cecs originated before these lineages diverged. Within Diptera, Cecs from Brachycera (which include *Drosophila*) form a monophyletic group, which is closely related to that of Lepidoptera and is distinct from that of Culicomorpha (mosquitos) (Figure 1).

**Figure 1.** Phylogenetic tree of insect Cecropins (Cecs) and Cec-like peptides. Maximum likelihood mid-point rooted phylogenetic tree showing the relationships of insect Cecs and Cec-like peptides. The tree was obtained with FastTree 2.1.5 software with the WAG + Γ model [81]. Lepidoptera peptides are shown in red, Trichoptera in orange, Diptera Brachycera in dark blue, Diptera Culicomorpha in light blue, and Coleoptera in green. Full-length Cecs and Cec-like peptides were downloaded from the OrthoDB database (Available online: https://www.orthodb.org/), which contains 230 sequences in 61 species of Lepidoptera and Diptera. Other sequences, including those from *Simulium*, Trichoptera, and Coleoptera, were downloaded from NCBI and UniprotKB. Sequences were aligned with ClustalW using default parameters in Geneious 8.1.9 (BioMatters). Identical sequences within species were removed, leaving a total of 254 Cecs and Cec-like peptides. Either the UniprotKB or the NCBI accession number is reported for each sequence in the tree. Circles at the nodes indicate node support obtained with Shimodaira–Hasegawa-like local support. Shade (as shown in the legend) and circle size are proportional to the node support value (0–1). The scale bar corresponds to estimated amino acid substitutions per site.
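The filtering step mentioned in the caption, removing identical sequences within a species before tree building, can be sketched as below. This is an illustration only: the record layout, the accession labels, and the toy sequences are hypothetical placeholders, and the authors' actual procedure in Geneious is not described in the caption.

```python
def drop_within_species_duplicates(records):
    """Keep the first record for each (species, sequence) pair.

    Each record is a tuple of (accession, species, sequence). Identical
    sequences found in different species are deliberately retained.
    """
    seen = set()
    kept = []
    for accession, species, sequence in records:
        key = (species, sequence.upper())
        if key not in seen:
            seen.add(key)
            kept.append((accession, species, sequence))
    return kept


# Hypothetical records with placeholder accessions and toy sequences.
records = [
    ("ACC_001", "Species one", "KKIEKVGQNIRDG"),
    ("ACC_002", "Species one", "KKIEKVGQNIRDG"),   # duplicate within species
    ("ACC_003", "Species one", "RKLEKVGRNIRDA"),
    ("ACC_004", "Species two", "KKIEKVGQNIRDG"),   # same sequence, other species
]

unique_records = drop_within_species_duplicates(records)
print([accession for accession, _, _ in unique_records])
```

Keeping duplicates across species preserves the information that distinct lineages share a peptide, which matters for interpreting the resulting tree.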

#### **3. Cec Gene Expression and Mechanism of Action Against Microorganisms**

In the absence of any infections, *Cec* genes can be constitutively expressed at low levels in different body compartments, as demonstrated in the *Drosophila* reproductive tract [82] or in the silkworm *B. mori* midgut or fat body (a structure equivalent to the mammalian liver) [83]. Following an immune challenge, *Cecs* become highly transcribed in several tissues, such as gut epithelia or epidermis during local infections, and the fat body and hemocytes during systemic infections (e.g., [51,82,83]). Like other AMPs, Cecs are translated as immature pre-peptides, undergo proteolytic cleavage of the N-terminal signal peptide, and are secreted in a mature and active form [5,7]. Before maturation, Cec sizes range between 58 and 79 aa, while active forms contain between 34 and 55 residues (Table 1). Experimental and computational analyses indicated that Cec and Cec-like peptides are structurally related and are characterized by an N-terminal basic, amphipathic domain linked to a more hydrophobic C-terminal segment through a flexible proline- and glycine-rich hinge region (Figure 2A; [5,7,84]).

**Figure 2.** Cecropin (Cec) structure and mechanisms of action against bacteria. (**A**) Structure of the mature 35 aa *B. mori* Q53 Cec B natural variant [53] obtained using SWISS-MODEL (Available online: https://swissmodel.expasy.org/), showing N- and C-terminal α-helices linked through a flexible hinge region. (**B**) Model of action against bacteria. Cecs associate with the bacterial membrane, with the long axes of the α-helical domains parallel to the lipid bilayer surface. Polar residues interact with the lipid phosphates; non-polar residues bury in the hydrophobic core of the membrane. At high concentrations (upper part), Cecs form a carpet-like structure with detergent-like properties, disrupting membranes. At lower concentrations (lower part), Cecs form pores, which affect the cellular electrolyte balance, causing bacterial death [85]. The pore is formed of different Cec molecules organized as oligomers, with C-terminal hydrophobic domains submerged into the phospholipidic hydrophobic chains [86]. The red rectangle represents the N-terminal helix, the blue one the C-terminal helix; the dark blue ellipse indicates the C-terminal amidated residue.

Insect Cecs and Cec-like peptides are generally active against Gram-negative bacteria and, to a lesser extent, Gram-positive bacteria (Table 1). Some have been demonstrated to also exhibit antifungal activity (Table 1). Moreover, Cec and Cec-like peptides were shown to have a low toxicity against normal mammalian cells and a weak or absent hemolytic effect against mammalian erythrocytes (Table 1). As for other cationic AMPs, the ability of these peptides to target microorganisms without interacting with host eukaryotic cells relies on the difference in composition of the respective cell membranes. Bacterial membranes are predominantly composed of negatively charged compounds (e.g., phosphatidylglycerol, cardiolipin, and phosphatidylserine), while eukaryotic membranes are largely neutral owing to the presence of zwitterionic phospholipids and cholesterol [87]. Furthermore, Gram-negative bacteria possess an external membrane rich in negatively charged lipopolysaccharides (LPS, also known as endotoxin), whereas in Gram-positive bacteria, the peptidoglycan is anchored to the cytoplasmic membrane by negatively charged teichoic acids. It is also generally thought that the discrimination between fungi and other eukaryotic host membranes is due to the different sterol compositions of their respective membranes [87].

Using chemically synthesized natural Cec variants and modified analogs, several studies have been performed to explain the Cec action mechanism against pathogens, as well as to identify the functions of specific residues within the peptide. Most mature Cec peptides contain a tryptophan residue in the first or second position, which is considered important in conferring full antimicrobial activity to the peptide [5,7,84,88]. A study performed on Papiliocin, from the lepidopteran *Papilio xuthus*, suggested that the aromatic residues Trp2 and Phe5 in the N-terminal region are essential for the full-length peptide to interact with LPS in the outer membrane and to permeabilize the inner membrane of Gram-negative bacteria [58]. However, some dipteran Cecs, such as those from the black fly *Simulium bannaense* and the mosquito *Aedes aegypti*, have been shown to be highly effective against different bacteria despite lacking an N-terminal tryptophan residue [22,25].

In several cases, Cec peptides undergo amidation of the C-terminal residue, a post-translational modification, which increases both antimicrobial activity and the action spectrum of the peptide [6,7]. It has been demonstrated that the antimicrobial activity of Cec AMPs relies on the structure they assume in the presence of bacterial cells. Circular dichroism analyses showed that in aqueous solution, Cecs have a random coiled structure but adopt α-helical conformations upon interaction with microbial membranes, where they exert a lytic effect [53,58,84,86]. Although some aspects remain unclear, it is currently accepted that Cec peptides do not interact with specific receptors but initially associate with the bacterial membrane along the axes of the α-helical domains parallel to the lipid bilayer surface. At this level, the polar residues of the peptide interact with the lipid phosphates, while the non-polar side chains burrow in the hydrophobic core of the membrane [84] (Figure 2B). In a first model of action, the continuous accumulation of peptides at the bacterial lipid bilayer leads to the formation of a peptide "carpet" on the membrane surface. This "carpet" structure possesses intrinsic detergent-like lytic properties, which disintegrate the membranes [84]. Cec P1 [14,15] and *H. cecropia* Cecs, when administrated at high concentrations (Cec P1 > 25 μM; *H. cecropia* Cecs > 5 μM), appear to act through this carpet-like mechanism (Figure 2B) [84,85]. However, at lower concentrations (2–5 μM)*, H. cecropia* Cecs are able to associate with membranes and form channels or pores, which affect cellular electrolyte balance and in turn cause the death of the microorganism (Figure 2B) [84–86]. Initially, it was postulated that the N-terminal amphipathic regions of the peptides were involved in the formation of the pore (called "type II channel"), with the positively charged residues forming the inner channel [89,90]. 
Subsequent authors have hypothesized that the C-terminal hydrophobic domains of the peptides insert into the membrane, giving rise to a more stable pore (type I channel), in which the polar aa of the C-terminal helices are oriented toward the center of the pore [85,86,90]. Efimova and colleagues analyzed the effect of *H. cecropia* Cecs A and B in model lipid membranes, with or without small molecules capable of modifying the physicochemical properties of the membrane [85,86]. Using these data, they developed a model in which Cec peptides first interact as monomers with the hydrophilic heads of the lipid bilayer surface, lying parallel to the membrane plane. Next, the peptides submerge their C-terminal hydrophobic domains into the hydrophobic chains of the phospholipids. Individual Cec molecules then organize into oligomers, forming ion-permeable pores in the cell membrane (Figure 2B). Other monomers can then insert into the pores, increasing the conductance of the ion channels. The authors also postulated that all the steps of this process are reversible and in equilibrium [86]. This pore model therefore resembles the "barrel-stave" model, in which the different C-terminal regions of the *H. cecropia* Cec peptides are organized to form a barrel penetrating the bacterial membrane. However, in cases where the peptide is shorter than ~22 aa (e.g., synthetic Cec-derived analogs, see below), the structure of the pore might be more similar to the so-called "toroidal-pore" model, in which the pore is composed of both peptides and lipids [84].

As mentioned above, natural Cec and Cec-like peptides show a higher activity against Gram-negative compared to Gram-positive bacteria. This feature has been related to the difference in the intrinsic properties of bacterial membranes (i.e., lipid composition, charge density, and electrochemical potential across the membrane), as demonstrated when evaluating *H. cecropia* Cec B against protoplasts obtained from Gram-negative *Escherichia coli* and Gram-positive *Staphylococcus aureus* or *S. epidermidis* [46]. Moreover, a recent study on natural Papiliocin and its modified derivatives associated the Cec's preferential activity against Gram-negative bacteria specifically with the presence of the C-terminal helix. In fact, compared to the full-length natural form, a truncated Papiliocin carrying only the N-terminal portion was less effective against Gram-negative, and more active against Gram-positive bacteria [58].

Finally, in a study evaluating the interaction between different *B. mori* natural Cec B variants and live Gram-negative *Pseudomonas aeruginosa*, it was suggested that Cecs might first affect the outer bacterial membrane, enabling the translocation of the peptide to the inner membrane, resulting in the disorganization of both lipid bilayers [53].

#### **4. In Vitro Antimicrobial Activity of Natural Cecs and Synthetic Cec-Analogs**

Numerous basic research studies have shown that natural Cecs or synthetic Cec-analogs can have antibacterial, antifungal, antiviral, and antiprotozoal properties (Tables 1 and 2 and references therein). Although there is a lack of uniformity among these studies, the peptides have generally exhibited a high in vitro activity against Gram-negative bacteria. These also included multidrug-resistant (MDR) strains listed by the World Health Organization (WHO) in the three "critical, high and medium" priority groups, requiring the development of new antibiotics [91].




**Table 2.** *Cont*.

Peptide conc. (μM): Peptide concentration showing no or weak toxicity in mammalian cells; Cytotox.: Cytotoxicity; Hem act.: Hemolytic activity; sub.: Substitution; add.: Addition; del.: Deletion; G+: Gram-positive bacteria; G−: Gram-negative bacteria; *Sa*: *S. aureus*; *Ec*: *E. coli*; *Pa*: *P. aeruginosa*; *Ab*: *A. baumannii*; A: Active against the tested species; NA: Not active against the tested species; -: Not determined; (?): Not reported.

Cec peptides were effective against laboratory strains of *P. aeruginosa* and different *Enterobacteriaceae* spp. (including *K. pneumoniae* and *E. coli*), bacterial species belonging to the first (critical) WHO priority group. *M. domestica* Mdc, black fly SibaCec, and dung beetle Oxysterlins were active against MDR and clinically isolated *E. coli* strains [19,33,108], while *H. cecropia* Cec A and *P. xuthus* Papiliocin efficiently killed MDR *P. aeruginosa* isolates [45,58]. Mdc and SibaCec were also active against reference strains of *Acinetobacter baumannii*, which is also in the critical group on the WHO list [22,35]. Lepidopteran *H. cecropia* Cec A, *P. xuthus* Papiliocin, Cec D from the mosquito *A. aegypti*, and different synthetic CAM hybrids (formed from the fusion of the N-terminal regions of *H. cecropia* Cec A and *Apis mellifera* Melittin) were effective against MDR *A. baumannii* strains [25,45,58,95].

Several natural Cecs and Cec-analogs have also shown activity against the food-borne Gram-negative pathogen *Salmonella typhimurium*, included in the high priority group of the WHO list (e.g., [19,23,25,58,80,98]). In addition, some dipteran Cec AMPs, such as those from the mosquitoes *Aedes albopictus* and *Culex pipiens*, were active against *Francisella novicida*, a facultative Gram-negative bacterium used as a reference species to model *F. tularensis*, a zoonotic pathogen causing tularemia in humans and animals [28].

It is important to note that, although natural Cecs and Cec-like peptides generally demonstrated antimicrobial activity against Gram-positive bacteria such as *Bacillus* spp. and *Micrococcus luteus*, the vast majority were inactive or only weakly active against *S. aureus*, which belongs to the high priority group on the WHO list (an exception appears to be the horse fly Cec TY1, which is reported to be more active against *S. aureus* than *E. coli*; [29]). Interestingly, synthetic Cec-analogs were active against *S. aureus*. In particular, anti-*S. aureus* activity was reported for CAM peptides [94,96,98,99] and for other chimeric hybrids, such as CA-MA or CA-LL37, obtained from the fusion of *H. cecropia* Cec A N-terminal fragments with portions of *Xenopus laevis* Magainin [102] or human LL-37 AMP [106], respectively (Table 2). Similarly, ΔM2 (a synthetic variant of *Galleria mellonella* Cec D with modified residues in the N-terminal region; [54]) and Cec XJ forms (2-aa longer variants of *B. mori* Cec B; [107]) were also effective against *S. aureus* (Table 2).

Moreover, Cec D from the lepidopteran *G. mellonella* showed antibacterial activity against *Listeria monocytogenes*, a Gram-positive bacterium causing listeriosis, a food-borne infection, which can cause meningitis, meningoencephalitis, and fatal sepsis [54,109].

Several natural Cecs and analog derivatives have also been tested against a variety of fungi (Tables 1 and 2). Although the peptides were not all effective against these microorganisms, *H. cecropia* Cecs A and B [44], *P. xuthus* Papiliocin [58], *Artogeia rapae* Hinnavins [65,66], Cec A from the mosquito *Anopheles gambiae* [23], and a Cec-analog derived from the D-enantiomerization of *Antheraea pernyi* Cec B [93] were active against *Candida albicans*, an opportunistic pathogen responsible for candidiasis in human hosts [110]. Synthetic analogs also showed in vitro antiprotozoal activities, as demonstrated for SB-37 and Shiva, which were effective against *Trypanosoma cruzi* and *Plasmodium falciparum* [92], and a chimeric CAM hybrid active against *P. falciparum* [94] (Table 2). Finally, several Cec and Cec-analog peptides have also been tested for their potential antiviral activity (Tables 1 and 2). *H. cecropia* Cec A was able to suppress replication of human immunodeficiency virus 1 (HIV-1) by inhibiting viral gene expression [43], while Cec D was active against the porcine reproductive and respiratory syndrome virus (PRRSV) [47]. Additionally, engineered CA-MA hybrids were shown to inhibit virus–cell fusion activity [104].

#### **5. Anti-Inflammatory Properties of Natural Cecs and Synthetic Cec-Analogs**

Some Cec AMPs have been explored for their potential anti-inflammatory activity. Inflammation is a protective response of the organism against different factors, including pathogens, which contributes to the removal of harmful foreign agents and to the initiation of reparative processes. An uncontrolled inflammatory response can, however, be dangerous, eliciting different acute or chronic diseases (reviewed in [111]). During Gram-negative infections, the release of LPS can overstimulate the innate immune system, resulting in septic shock [112]. Several Cecs and Cec-analogs are able to bind LPS and have shown both in vitro and in vivo anti-inflammatory properties. Specifically, peptides derived from Lepidoptera (*H. cecropia* Cec A [45], Papiliocin and derivatives from *Papilio xuthus* [57,58,113], Cec B, and a synthetic analog from *A. pernyi* [49]) were able to inhibit the production of nitric oxide and the transcription of several pro-inflammatory genes in LPS-treated murine cells in vitro. Similar properties characterized natural Cecs from Diptera, such as Cec TY from the horse fly *Tabanus yao* [108], SibaCec from the black fly *S. bannaense* [22], and AeaeCec 1 from the mosquito *A. aegypti* [26]. In addition, in vivo studies showed that intraperitoneal administration of *H. cecropia* Cecs A and B or a Papiliocin analog was able to reduce bacterial concentrations, plasma endotoxin levels, and mortality in *E. coli*-infected rodent models [113,114]. Finally, *M. domestica* Mdc was shown to alleviate colonic mucosal barrier impairments induced in mice by a *Salmonella typhimurium* infection, with a reduction in the colonic inflammation and oxidative stress response [115]. These studies demonstrate the dual antimicrobial and anti-inflammatory functions of Cec AMPs, underpinning their potential utilization in biomedical applications.

#### **6. Antitumor Activity of Natural Cecs and Synthetic Cec-Analogs**

Although the antitumor activities of Cecs and Cec-analogs have been less widely studied than their antimicrobial activities, these peptides do possess antitumor properties. Examples include *H. cecropia* Cecs A and B, *M. domestica* Mdc, *B. mori* Cec XJ derivatives, and the chimeric CAM and CA-MA hybrids, which were active against different types of human and rodent cancer cell lines in vitro [80,100,116–121]. Cec XJ and Mdc were also shown to inhibit proliferation and promote apoptosis of transformed cells in vitro [80,120]. Interestingly, when tested at the same concentrations, none of the analyzed AMPs showed any cytotoxic effects against normal cell lines. This selective antitumor activity might in part depend on differences in membrane composition and fluidity between transformed and non-transformed cells [122]. Finally, Cec antitumor activity was also demonstrated in in vivo mammalian models, as shown for *H. cecropia* Cec B and *B. mori*-derived Cec XJ, both of which improved the survival of mice bearing malignant ascites [117,123], indicating the potential of these AMPs as anticancer therapeutics.

#### **7. Health Benefits of Natural Cecs and Synthetic Cec-analogs: Future Potential and Limitations**

Several studies have suggested that some natural Cecs and synthetically derived Cec peptides represent promising molecules for the development of new antibacterial drugs. Resistance to conventional antibiotics is a global phenomenon, involving not only the health system, but also livestock production [124]. The potential of insect AMPs as antimicrobial dietary supplements has been recently reviewed [125]. In addition, different studies have reported the use of transgenesis to produce Cec-overexpressing plants and animals exhibiting greater resistance to pathogenic infections compared to non-transformed controls (e.g., [126,127]). Although effective, the use of transgenic strategies is limited by the regulatory laws of different countries and is not discussed in detail in this review. In the following paragraphs, we consider the potential of peptides belonging to the Cec family as therapeutics for clinical applications.

#### *7.1. Potential of Natural Cecs and Cec-analogs as Antibacterial Drugs*

Unlike other AMPs, Cec and Cec-analog peptides have generally shown low in vitro toxicity, evaluated as cytotoxicity against normal mammalian cell lines and/or hemolytic activity against human or rodent erythrocytes (Tables 1 and 2). Although there is variation among the analyzed Cecs, the peptide concentrations showing initial toxicity against mammalian cells were one or two orders of magnitude higher than the minimum inhibitory concentration (MIC) values against the analyzed bacteria. Interestingly, low toxicity was also typical of different Cec chimeric hybrids, including some CAM, CA-MA, and CA-LL37 peptides [100,102,106], which generally showed a wide action spectrum against both Gram-positive and Gram-negative bacteria (Table 2).
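The margin between toxicity and antibacterial potency described above can be expressed as a simple ratio. The following Python sketch is illustrative only: the function names and the example concentrations are hypothetical and are not taken from Tables 1 and 2. It merely shows how the ratio between the lowest toxic concentration and the MIC translates into orders of magnitude.

```python
import math


def selectivity_index(toxic_conc_um: float, mic_um: float) -> float:
    """Ratio of the lowest toxic concentration to the MIC (both in micromolar).

    Hypothetical helper: larger values mean a wider margin between the dose
    that inhibits bacteria and the dose that starts to harm mammalian cells.
    """
    if toxic_conc_um <= 0 or mic_um <= 0:
        raise ValueError("concentrations must be positive")
    return toxic_conc_um / mic_um


def orders_of_magnitude(ratio: float) -> float:
    """Express a fold-difference as a base-10 logarithm."""
    return math.log10(ratio)


# Hypothetical example values (not measured data).
ratio = selectivity_index(toxic_conc_um=200.0, mic_um=2.0)
print(f"{ratio:.0f}-fold margin, {orders_of_magnitude(ratio):.1f} orders of magnitude")
```

A ratio between 10 and 100 would correspond to the "one or two orders of magnitude" reported in the text; where a given peptide falls depends on the cell line, the bacterial strain, and the assay conditions.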

Several natural Cecs and synthetic derivatives have shown a high stability to heat treatments and/or pH variations (e.g., [53,98,107]). In addition, they usually maintained their antimicrobial activity in complex biological fluids, mimicked in vitro by using high concentrations of serum, as well as in the presence of elevated levels of divalent cations such as Ca<sup>2+</sup> and Mg<sup>2+</sup>, which occur at concentrations of 1–2 and 0.5 mM, respectively, in human saliva and might reduce or inhibit AMP effectiveness (e.g., [37,53,107,128]). Similarly, natural Cecs and Cec-analogs were also active when analyzed in the presence of high concentrations of Na<sup>+</sup>, typical of airway surface fluids from patients affected by cystic fibrosis, who often suffer lung infections from bacteria such as *P. aeruginosa*, *A. baumannii*, or *S. aureus* (e.g., [49,53,98,102,129]).

The vast majority of data on Cec antimicrobial activity is derived from in vitro analyses. However, some studies have shown the potential of these peptides in vivo. For example, single intraperitoneal administrations of *H. cecropia* Cecs A and B, and *Danaus plexippus* DAN2 decreased mortality in acutely *E. coli*-infected rodent models [114,130]. In addition, mice subjected to DAN2 doses two-fold higher than the most effective antibacterial concentration did not display any behavioral or morphological abnormalities, indicating in vivo that these peptides lack toxic effects after acute treatments [130].

With the prospect of employing natural Cecs and Cec-analogs in the treatment of infectious diseases, one of the potential problems is the capability of pathogens to develop resistance to these AMPs. Antimicrobial resistance is a complex phenomenon involving the development of intrinsic and/or acquired factors able to inactivate a compound or modify a target, nullifying the action of the specific drug. Currently, most of the considerations and data on AMP resistance in the literature refer to bacteria. Although it is generally accepted that bacteria do not develop resistance to AMPs as easily as to conventional antibiotics, cases of bacterial resistance have been reported for non-Cec AMPs [125,131]. However, a recent study on *E. coli* compared the bacterial mutation rates induced by treatments with antibiotics and by treatments with cationic AMPs, including *H. cecropia* Cec A [132]. Unlike antibiotics, none of the analyzed AMPs increased *E. coli* mutation rates. The authors linked this phenomenon to the inability of these AMPs to activate bacterial stress pathways that promote DNA mutagenesis [132]. Since the members of the Cec family act against bacteria with a similar bactericidal mechanism at a molecular/cellular level, these data suggest that these AMPs are unlikely to stimulate the development of new intrinsic resistance factors linked to an increased mutation rate, at least in the *E. coli* model.

Long-term exposure to low levels of an antimicrobial compound is an important driver of antimicrobial resistance. Promising data have shown that long-term treatments with the hybrid CAM peptide at sublethal concentrations did not significantly alter the peptide MIC. Following treatment, CAM remained effective against both laboratory reference and MDR *P. aeruginosa* strains, whereas similar serial exposures to sublethal doses of gentamicin or LL-37 increased their effective MICs on the same bacterial strains [97]. These studies provide important data suggesting that treatments with Cec and Cec-analog peptides do not easily induce antimicrobial resistance. However, dedicated studies analyzing all aspects of bacterial resistance, including the possible acquisition of exogenous factors through horizontal gene transfer, should be performed for each promising Cec or Cec-analog antibacterial candidate.

An innovative approach that is gaining interest is the use of AMPs as adjuvants in combination with conventional antibiotics [133]. Simultaneous treatments with AMPs and antibiotics can produce synergistic antimicrobial effects that increase therapeutic efficacy and lower the administered doses, in turn decreasing potential toxic side effects. This aspect should be evaluated for each Cec or Cec-analog candidate, since in vitro studies performed on Cec-analog peptides, such as CAM or Cec-LL37 hybrids, showed variable degrees of synergy, depending on the types of antibiotics and bacterial species (e.g., [96,97,106,134]).

#### *7.2. Natural Cecs and Cec-Analogs as Anti-Biofilm Compounds*

Biofilms are bacterial communities embedded in an extracellular matrix of polysaccharides, proteins, lipids, and DNA [135]. The bacteria forming biofilms display numerous emergent social behaviors, are less susceptible to the effectors of the human defense system, and exhibit a higher tolerance to conventional antibiotics, conferred in part by the extracellular matrix [135,136]. Several bacteria responsible for infections in hospitalized and/or immunocompromised patients can form biofilms, including Gram-negative *P. aeruginosa*, *A. baumannii*, and *K. pneumoniae*, and Gram-positive *S. aureus* and *S. epidermidis* [135]. Estimates indicate that biofilms are associated with at least two-thirds of all clinical infections [136]. In humans, many surfaces can be infected by biofilms, such as skin, teeth, ears, bones, and the respiratory and urinary tracts. Biofilms can also grow on medical devices, such as artificial implants, valves, and catheters, frequently used in modern medicine as feasible solutions to rescue compromised organs. Medical devices are composed of different types of biomaterials, and a great effort has been made to develop safe biomaterials. However, microbial colonization of biomaterials remains one of the major problems related to the use of such devices. Contaminated devices can cause biomaterial-associated infections that are difficult to treat with conventional antibiotic therapies, triggering severe consequences for patient health [137].

Innovative anti-biofilm treatments are therefore needed [136,137] and Cecs and Cec-analogs might represent a promising solution. Two in vitro studies have demonstrated anti-biofilm effects of CAM hybrids, alone and in combination with conventional antibiotics, to treat both *P. aeruginosa* and *S. aureus* biofilms [134,138]. In addition, the use of AMPs to coat biomaterials during device manufacturing is considered a promising strategy to prevent biomaterial-associated infections (reviewed in [137]). Different studies have explored the possibility of using Cec and Cec-analog peptides in the functionalization of several types of materials used in biomedicine, such as hydrogels [139], polyurethane surfaces [140], as well as silk fibroin films or fibers [141,142]. These peptide-enriched materials were able to inhibit the growth of *E. coli* [139,141,142] and *S. epidermidis* [140], supporting the potential of Cec and Cec-analog peptides in these applications.

#### *7.3. Biomedical Applications of Natural Cecs and Cec-Analogs: Limitations and Potential Solutions*

All potential treatments that aim to inhibit pathogenic infections and combat antimicrobial resistance, including AMPs, suffer limitations in their overall efficacy. Peptides are subject to degradation by naturally occurring proteases, such as trypsin, which is abundant in the digestive tract, and degradation by trypsin has been demonstrated for *B. mori* Cecs (Cec B and Cec XJ variants, specifically) [53,107]. Furthermore, Cec peptides can also be targets for human elastase, which is produced by neutrophils, defense cells recruited during infections. In addition, Cec AMPs might be inactivated by proteases secreted by pathogens, such as *Pseudomonas* elastase and *S. aureus* V8 protease [98]. However, AMP sensitivity to proteolytic degradation can be limited in a number of ways. The substitution of specific residues is one such method; this was recently demonstrated in CAM peptides, where a four-tryptophan-substitution variant (CAM-W) lost susceptibility to degradation by each of the enzymes mentioned above (Table 2) [98]. Peptide stability against proteolysis can also be achieved by the substitution of the natural L-residues with their respective D-enantiomers. This method was used to generate an all-D enantiomer of the *A. pernyi* L-Cec B [93]. The resulting D-Cec B peptide maintained potent biocidal activity, while resisting the proteolytic activity that degraded the L-form (Table 2) [93].

In addition, enzymatic degradation might be limited by employing novel strategies based on the use of nanotechnologies. Indeed, the use of nanoparticles (NPs) to develop new formulations for AMP delivery is considered an improvement able to enhance peptide stability, while increasing peptide bioavailability and efficiency at the desired target site, as well as reducing the risk of possible toxic side effects [3,143]. In a recent study, Rai and colleagues demonstrated that the conjugation of CAM peptides to gold NPs enhanced in vitro CAM antimicrobial activity and stability as well as in vivo efficacy in a sepsis mouse model [144]. These encouraging results are opening new prospects for the use of Cec and Cec-analog peptides (and AMPs in general) as therapeutics to treat infectious diseases. In particular, the possibility of using biodegradable and biocompatible organic materials to encapsulate the peptide should be explored to give rise to new formulations for non- or less-invasive delivery routes (e.g., nasal, buccal/sublingual, or transdermal routes).

A second drawback that has slowed down the development of AMPs as new antimicrobial drugs is associated with the costs of large-scale production, which are generally much higher than those of small antibiotic molecules. Peptide compounds can be produced using a variety of techniques, including chemical synthesis, cell-free expression systems, recombinant DNA technologies for production in heterologous cell systems, and transgenic organisms. Since natural Cecs and Cec-analogs generally have a low molecular weight (<4 kDa), chemical synthesis appears to be the best option for their production [145]. In addition, this technology allows the substitution of natural amino acids with atypical residues such as D-enantiomers, or the introduction of aa modifications (such as C-terminal amidation), often required in natural Cec and Cec-analog peptides. Chemical synthesis is undoubtedly an expensive approach [143]; however, due to the continuous development of efficient synthesis methodologies, progressive cost reductions for reagents, and competition among companies [6,145], considerable cost reductions are expected in the future. The cost of developing possible AMP-based therapies should also be weighed against the social and economic burden caused by the current progressive and alarming spread of MDR infectious diseases [1]. Taking the USA as an example, 23,000 Americans are estimated to die annually with antibiotic-resistant infections, while in 2018 the direct national costs of treating antibiotic-resistant infections were projected to exceed \$2 billion annually [146]. To these costs, other indirect economic and social costs should be added.

#### **8. Conclusions**

Insect Cecs and Cec-analog peptides are a class of AMPs that appear to be promising candidates as antibacterial therapeutics. These AMPs, tested alone or in combination with conventional antibiotics, show powerful antimicrobial activity against several important human pathogens, including MDR bacterial strains. They also exhibit low toxicity against mammalian cells and anti-inflammatory activity. Preliminary indications suggest that the development of new resistance phenomena against these peptides appears unlikely. However, few preclinical and no clinical analyses have been performed to date. In particular, long-term and/or longitudinal studies exploring potential side effects such as allergenicity or immunogenicity should be completed [6].

The intrinsic nature of Cec peptides, which makes them sensitive to protease degradation, together with the cost of large-scale production has slowed down or even impeded the development of Cec-based antimicrobial drugs. However, the advance of new strategies such as nanotechnologies will considerably reduce these limitations. The use of natural Cecs might allow the production of formulations active against Gram-negative bacteria, while the employment of Cec-analogs might give rise to therapeutics with a wide spectrum, effective against both Gram-negative and Gram-positive pathogens. In addition, given their anti-biofilm activity, Cecs and Cec-analogs might be used to coat biomaterials for medical devices as a strategy to prevent biomaterial-associated infections. Although further research and development studies are required, several lines of evidence suggest that both insect Cecs and Cec-analogs represent a suitable tool to counteract the alarming global spread of MDR pathogens.

**Author Contributions:** Conceptualization of the manuscript, D.B., A.G., O.R., and F.S. Data analysis, A.G. and D.B. All authors wrote, read, and approved the manuscript.

**Acknowledgments:** We thank two anonymous reviewers for their useful comments on a previous version of the manuscript and Maxine Iversen for proofreading the final version of the manuscript. Federica Sandrelli acknowledges Cinchron, a European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie (grant agreement No 765937), CARIPARO (Progetti di Eccellenza 2011/12) and Università degli Studi di Padova (CPDA154301).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Whey-Derived Peptides Interactions with ACE by Molecular Docking as a Potential Predictive Tool of Natural ACE Inhibitors**

#### **Yara Chamata <sup>1</sup>, Kimberly A. Watson <sup>2</sup> and Paula Jauregi <sup>1,</sup>\***


Received: 18 December 2019; Accepted: 27 January 2020; Published: 29 January 2020

**Abstract:** Several milk/whey-derived peptides possess high in vitro angiotensin I-converting enzyme (ACE) inhibitory activity. However, in some cases, poor correlation between the in vitro ACE inhibitory activity and the in vivo antihypertensive activity has been observed. The aim of this study is to gain insight into the structure-activity relationship of peptide sequences present in whey/milk protein hydrolysates with high ACE inhibitory activity, which could lead to a better understanding and prediction of their in vivo antihypertensive activity. The potential interactions between human ACE and peptides produced from whey proteins and previously reported as potent ACE inhibitors, such as IPP, LIVTQ, IIAE, and LVYPFP, were assessed using a molecular docking approach. The results show that peptides IIAE, LIVTQ, and LVYPFP formed strong hydrogen bonds with the amino acids Gln 259, His 331, and Thr 358 in the active site of human ACE. Interestingly, the same residues were found to form strong hydrogen bonds with the ACE inhibitory drug Sampatrilat. Furthermore, peptides IIAE and LVYPFP interacted with the amino acid residues Gln 259 and His 331, respectively, in common with other ACE-inhibitory drugs such as Captopril, Lisinopril, and Enalapril. Additionally, IIAE interacted with the amino acid residue Asp 140 in common with Lisinopril, and LIVTQ interacted with Ala 332 in common with both Lisinopril and Enalapril. The peptides produced naturally from whey by enzymatic hydrolysis interacted with residues of human ACE in common with potent ACE-inhibitory drugs, which suggests that these natural peptides may be potent ACE inhibitors.

**Keywords:** ACE-inhibitory activity; whey peptides; molecular docking; hypertension

#### **1. Introduction**

The number of people with unhealthy living habits who have developed cardiovascular disease (CVD) has increased in recent years. The WHO reported that an estimated 17.9 million people lose their lives as a result of cardiovascular disease every year [1]. CVDs have become the leading cause of death globally [2]. High blood pressure (hypertension) is one of the most important well-defined risk factors for CVD [3]; therefore, cardiovascular diseases can be prevented with blood pressure-lowering treatment. Blood pressure is regulated by the renin-angiotensin system (RAS), through the modulation of the angiotensin-converting enzyme (ACE), bradykinin, and other factors [4–6].

ACE (dipeptidyl carboxypeptidase, EC 3.4.15.1) is a zinc metallopeptidase found in male genital, vascular endothelial, neuro-epithelial, and absorptive epithelial cells [7–9]; it displays both endopeptidase and exopeptidase activities, acting on a wide range of substrates [10]. ACE is a key enzyme for regulating blood pressure in the renin-angiotensin system. Renin cleaves the N-terminal segment of angiotensinogen to release the biologically inert angiotensin I (AT-1). ACE then hydrolyzes AT-1 by cleaving the carboxyl-terminal His-Leu dipeptide, converting the inactive AT-1 into the active angiotensin II (AT-2), a potent vasoconstrictor responsible for the development of hypertension [5,6,11,12]. ACE also indirectly influences the kallikrein–kinin system by promoting the inactivation and degradation of bradykinin, a vasodilator involved in blood pressure control [11–13]. By repressing AT-2 production and restraining bradykinin degradation, ACE inhibitory peptides control the increase of blood pressure [13].
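The ACE-catalyzed step described above, removal of the carboxyl-terminal His-Leu dipeptide from AT-1, can be sketched at the level of one-letter sequences. The sketch below assumes the commonly cited human angiotensin I sequence (DRVYIHPFHL), which is not given in the text; the function name `ace_cleave` is hypothetical, and the code models only the string-level outcome of the reaction, not enzyme kinetics or substrate specificity.

```python
from typing import Tuple

# Assumption: commonly cited human angiotensin I sequence (one-letter code).
ANGIOTENSIN_I = "DRVYIHPFHL"


def ace_cleave(peptide: str) -> Tuple[str, str]:
    """Mimic a dipeptidyl carboxypeptidase: split off the C-terminal dipeptide.

    Returns (remaining_peptide, released_dipeptide). Hypothetical helper for
    illustration; it does not check whether ACE would accept the substrate.
    """
    if len(peptide) < 3:
        raise ValueError("peptide too short to release a C-terminal dipeptide")
    return peptide[:-2], peptide[-2:]


angiotensin_ii, released = ace_cleave(ANGIOTENSIN_I)
print(angiotensin_ii, released)  # prints "DRVYIHPF HL"
```

An ACE inhibitor, whether a drug or a food-derived peptide, would act by preventing this cleavage, which is why docking studies focus on how candidate peptides occupy the enzyme's active site.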

Consequently, ACE-inhibiting natural products have been vigorously investigated during the last decades, due to their potential in lowering blood pressure during hypertension. Among the various types of bioactive peptides, ACE-inhibitory peptides from food sources have been the most extensively studied for their potential use as natural alternatives to drugs for reducing blood pressure through binding and inhibiting ACE, and thus preventing and managing hypertension [14,15]. Food-derived peptides are believed to represent a healthier and more natural alternative for the chronic treatment of hypertension. Moreover, although the inhibitory capacity of food-derived peptides is lower than that of chemically designed antihypertensive drugs, such as Captopril, Sampatrilat, Lisinopril, and Enalapril, it is thought that food-derived peptides are safer than pharmaceutical drugs because they lack some drug-associated adverse side effects such as angioedema, skin rashes, and dry cough [6,16]. However, considering the lack of consensus on their physiological antihypertensive effects in different human populations, the role of food peptides in regulating blood pressure is still a subject of ongoing debate [17–19].

Although different animal and plant proteins have been used in the development of functional foods providing antihypertensive activity, milk is the main source of antihypertensive ACE-inhibitory peptides reported to date [20]. Milk contains 3.5% protein, of which 80% are caseins, classified as α-, β- and κ-caseins, and 20% are whey proteins. Whey contains α-lactalbumin, β-lactoglobulin and other minor proteins. Upon the degradation of milk proteins, peptide fragments are released whose biological effects can differ from those of the parent protein. Several bioactive peptides in milk proteins have been identified [21], and they display an array of biological activities, including angiotensin-converting enzyme (ACE) inhibition, antimicrobial and antioxidative functions, dipeptidyl peptidase IV (DPP-IV) inhibition, opioid agonist and antagonist activities, immunomodulation, and mineral binding [22]. Several milk/whey-derived peptides possess high in vitro ACE inhibitory activity; in particular, hydrolysates of whey proteins, caseinates, fractions enriched in individual milk proteins, and whole milk proteins have been reported to be good sources of ACE-inhibitory peptides [14]. Ile-Pro-Pro (IPP) has been identified as the most potent ACE inhibitor from milk protein, and it is derived from casein [23]. The antihypertensive activity of this tripeptide has been demonstrated in several animal studies and human trials [24]. However, in some cases, poor correlation between the in vitro ACE inhibitory activity of milk-derived peptides and the in vivo antihypertensive activity has been observed. This can be partly due to digestion, which yields less active peptide sequences, and/or to their low bioavailability [25].
Also, antihypertensive activity may be exerted by mechanisms other than ACE inhibition [26,27], e.g., specific ACE inhibitors were demonstrated to increase the risk of microscopic colitis in a recent study, suggesting that milk-derived peptides may exert their antihypertensive activity through the microbiome [28].

The activity of these peptides depends on their inherent amino acid composition and sequence [29,30]. Shorter peptides up to 12 amino acids with hydrophobic and positively charged amino acids at the carboxyl end are more likely to interact with ACE [31]. In terms of favorable structure–function relationship for high ACE-inhibitory activity, dipeptides including bulky and hydrophobic amino acids are more potent whereas tripeptides having aromatic amino acids at the C-terminus end, positively charged amino acids in the middle and hydrophobic amino acids at the N-terminus end, are most potent [31]. Kobayashi et al. (2008) investigated the effects of aromatic amino acids in the third position of the tripeptides on ACE-inhibitory activity. They found that the difference in the ACE-inhibitory activity between the bioactive peptides (IKW, LKW, IKY, and LKF) resulted from the aromatic amino acids W, Y, and F. The highest inhibitory activity was presented by LKW, with the largest amino acid in the C-terminal. Accordingly, ACE-inhibitory activity is affected by the size of the amino acid, as well as its hydrophobicity [32]. In the same study, Kobayashi et al. (2008) examined the effects of the charged amino acid in the second position and they reported that in order to obtain a high inhibitory activity, it is essential to have a positively charged residue next to an aromatic residue. They also highlighted that the tripeptide sequence consisting of either I or L + positively charged amino acids + aromatic amino acids is likely to have high ACE-inhibitory activity. The charged amino acid takes part in binding by ACE while the bulky aromatic amino acid prevents the access between substrates and the active site of ACE [32]. Some studies have indicated that tri-peptides show higher ACE-inhibitory activity, and the C terminus end of the tri-peptides substantially affects binding to ACE. 
Hydrophobic amino acid residues or Proline residues at the carboxyl end are important for ACE inhibition and inhibitors containing these residues are resistant to digestion [33].
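The tripeptide pattern described above (I or L, then a positively charged residue, then an aromatic residue) can be expressed as a simple sequence check. The following is a minimal sketch only: the function name is hypothetical, and treating K and R as the positively charged residues (histidine is left out) is an assumption rather than something stated by Kobayashi et al.

```python
# Residue classes for the I/L + positive + aromatic tripeptide pattern.
# Assumption: "positively charged" is taken to mean Lys (K) and Arg (R) only.
N_TERMINAL_HYDROPHOBIC = {"I", "L"}
POSITIVELY_CHARGED = {"K", "R"}
AROMATIC = {"W", "Y", "F"}


def matches_tripeptide_motif(sequence: str) -> bool:
    """Return True if a one-letter sequence fits the (I/L)-(K/R)-(W/Y/F) pattern."""
    seq = sequence.upper()
    if len(seq) != 3:
        return False
    return (
        seq[0] in N_TERMINAL_HYDROPHOBIC
        and seq[1] in POSITIVELY_CHARGED
        and seq[2] in AROMATIC
    )


if __name__ == "__main__":
    for peptide in ["IKW", "LKW", "IKY", "LKF", "IPP"]:
        print(peptide, matches_tripeptide_motif(peptide))
```

A match only indicates that a sequence fits this qualitative rule; it does not predict an IC50 value, and IPP, for example, is a known ACE inhibitor that does not fit the pattern.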

ACE inhibitors are commonly discovered using classic investigation techniques, which include hydrolysis of proteins with different proteolytic enzymes, isolation and purification of peptides using chromatographic systems, and synthesis of the corresponding peptides for the confirmation of activity and structure [8]. Although some structure-activity relationships have been established for food protein derived peptides, they are still quite generic and thus cannot be used on their own to predict the ACE inhibitory activity of peptide sequences [34]. To avoid some challenges of the classical approach, such as having to apply cumbersome purification processes to isolate active peptides, the computer-based approach is considered a useful and effective method to identify novel peptides [35]. A number of docking algorithms are being used in multiple studies to predict potent ACE inhibitory peptides encrypted in food proteins [36–42], and more specifically in milk proteins [43,44], in an attempt to understand the interactions between receptor and ligand [45]. Molecular docking enables the investigation of the specific interactions between certain peptide sequences and specific binding site residues in ACE, which could help to provide a better prediction of bioactivity in vivo through a molecular understanding of the structure-function relationship. Such an approach can be a powerful tool for pre-screening potentially bioactive peptides prior to their testing in vivo.

Herein we investigate the potential interactions between whey protein derived peptides with high ACE inhibitory activity and human ACE, utilising a molecular docking approach. This study follows our previous work [30], where the peptides were produced by enzymatic hydrolysis of whey and were further fractionated for their chemical and activity characterization. Moreover, the interactions between these peptides and ACE are compared with those of the ACE inhibitory drugs Sampatrilat, Captopril, Lisinopril, and Enalapril.

#### **2. Results**

#### *2.1. Molecular Homology between Human ACE and Rabbit ACE*

According to the EMBOSS Needle results (Figure 1), human ACE and rabbit ACE share 93.4% similarity. As shown in Figure 1, the comparison of these two enzymes indicates that there is a close homology between the human ACE and the rabbit ACE, and that the active sites of human ACE and rabbit ACE are very similar. The rabbit ACE is generally used for the in vitro testing of ACE inhibition, hence it can be assumed that similar results will be obtained with human ACE. No previous studies have reported the homology between the human ACE and the rabbit ACE. However, in a study by Soubrier et al. (1988), the amino-terminal amino acid sequences of human ACE and ACE from other mammals (rabbit, calf, pig, and mouse) were compared, and a high degree of similarity was found between human ACE and that of these mammals [46].


**Figure 1.** EMBOSS Needle pairwise sequence alignment results. Colour coding is as follows: yellow indicates identical residues at the active site, green indicates similar residues at the active site, and red indicates that a part of the residue is similar. (|) residues are identical; (.) conserved change; (:) part of the residue is similar but not that conserved.

#### *2.2. Molecular Docking*

Molecular docking was conducted to elucidate the potential molecular interactions between the whey-protein derived peptide sequences and specific amino acids at the binding site of human ACE. The peptide sequences were docked into the binding site of the human ACE, using the X-ray crystallographic structure of the human ACE receptor (PDB code 6F9V). To validate the docking procedure, the extracted co-crystallized ligand, Sampatrilat [47], was first re-docked into the prepared protein. The RMSD between the docked conformation, as generated by the program PyMOL, and the native co-crystallized ligand conformation was 0.1 Å, which was well within the 2 Å grid spacing used in the docking procedure, demonstrating that the docking method to be used was valid and reliable. Additionally, the interactions between the docked ligand and the prepared target receptor mimicked those observed in the crystal structure of the same protein.

Hydrogen bonds are a significant factor contributing to the specificity and stability of protein-ligand interactions. Figures 2–5 and Tables 1–4 show the hydrogen bond interactions associated with each ligand and the surrounding ACE residues. IPP formed three hydrogen bonds with the ACE residues: one with Asp 354 and two with Gln 355 (Figure 2, Table 1). IIAE formed three hydrogen bonds with residues Thr 144, Gln 259, and Thr 358 (Figure 3, Table 2). LIVTQ formed three hydrogen bonds: one with Ala 332, one with Gln 355, and one with Thr 358 (Figure 4, Table 3). As for the ligand LVYPFP, five hydrogen bonds were formed with residues Asp 255, Ser 260, His 331, Arg 350, and Thr 358 (Figure 5, Table 4). It is interesting to note that several peptides had H bonds in common: Thr 358 formed H bonds with three of the peptides (IIAE, LIVTQ, and LVYPFP), and Gln 355 with IPP and LIVTQ.

Additionally, all except one of the amino acids in the active site of the ACE were polar (charged and non-charged), and some of these charged amino acids were also involved in salt bridge (electrostatic) interactions (Tables 1–4). IPP formed one salt-bridge interaction with residue Arg 350 (Figure 2, Table 1), and ligands IIAE and LIVTQ likewise each formed one salt-bridge interaction, with residues Asp 140 and Asp 255, respectively (Figures 3 and 4, Tables 2 and 3). As for LVYPFP, two salt-bridge interactions were formed with residues Glu 262 and His 331 (Figure 5, Table 4). Both LVYPFP and LIVTQ interacted with Asp 255, via H-bonding and a salt bridge, respectively; and both IPP and LVYPFP interacted with Arg 350, via a salt bridge and H-bonding, respectively.
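The overlap between the interaction partners of the four peptides can be tabulated programmatically. The sketch below is illustrative only: the residue lists are transcribed from the text above, while the dictionary layout and the function name `shared_residues` are assumptions introduced for this example and are not part of the docking workflow.

```python
from collections import defaultdict

# ACE residues contacted by each peptide, transcribed from the results above.
HYDROGEN_BONDS = {
    "IPP": {"Asp 354", "Gln 355"},
    "IIAE": {"Thr 144", "Gln 259", "Thr 358"},
    "LIVTQ": {"Ala 332", "Gln 355", "Thr 358"},
    "LVYPFP": {"Asp 255", "Ser 260", "His 331", "Arg 350", "Thr 358"},
}
SALT_BRIDGES = {
    "IPP": {"Arg 350"},
    "IIAE": {"Asp 140"},
    "LIVTQ": {"Asp 255"},
    "LVYPFP": {"Glu 262", "His 331"},
}


def shared_residues(*interaction_maps):
    """Map each ACE residue to the peptides contacting it, keeping residues hit by two or more peptides."""
    residue_to_peptides = defaultdict(set)
    for interactions in interaction_maps:
        for peptide, residues in interactions.items():
            for residue in residues:
                residue_to_peptides[residue].add(peptide)
    return {
        residue: sorted(peptides)
        for residue, peptides in residue_to_peptides.items()
        if len(peptides) >= 2
    }


if __name__ == "__main__":
    # Residues shared through hydrogen bonds only, then through any interaction type.
    print(shared_residues(HYDROGEN_BONDS))
    print(shared_residues(HYDROGEN_BONDS, SALT_BRIDGES))
```

Combining both interaction types recovers the four shared residues discussed in the text: Thr 358, Gln 355, Asp 255, and Arg 350.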

**Figure 2.** Docking results of the peptide IPP in the active site of human angiotensin I-converting enzyme (ACE). IPP is represented in red, interactions of human ACE residues with the peptide are indicated by arrows of different colours, with purple representing hydrogen bond interactions and blue arrows representing salt bridge interactions. The figure was generated using the software Maestro.

**Figure 3.** Docking results of the peptide IIAE in the active site of human ACE. IIAE is represented in red, the interactions of human ACE residues with the peptide are indicated by arrows of different colours with purple representing hydrogen bond interactions, and blue arrows representing salt bridge interactions. The figure was generated using the software Maestro.

**Figure 4.** Docking results of the peptide LIVTQ in the human ACE active site. LIVTQ is represented in red, the interactions of human ACE residues with the peptide are indicated by arrows of different colours with purple representing hydrogen bond interactions, and blue arrows representing salt bridge interactions. The software Maestro was used for the generation of this figure.

**Figure 5.** Docking results of the peptide LVYPFP in the human ACE active site. LVYPFP is represented in red, the interactions of human ACE residues with the peptide are indicated by arrows of different colours with purple representing hydrogen bond interactions, and blue arrows representing salt bridge interactions. The software Maestro was used for the generation of this figure.

**Table 1.** IPP docking results.



**Table 2.** IIAE docking results.

**Table 3.** LIVTQ docking results.



**Table 4.** LVYPFP docking results.

#### **3. Discussion**

Hydrogen bond interactions have been demonstrated to play a crucial role in stabilizing docked ligand complexes [48]. The hydrogen bonds between the whey-derived peptides and ACE amino acid residues were typically short (<3.0 Å; Tables 1–4), indicating that the peptides' binding affinity to ACE was strong [49]. In addition, these peptides formed a number of favorable salt bridge interactions with ACE residues, indicating that the ligands can pack tightly into the binding site and effectively inhibit ACE. Furthermore, it is interesting to note that hydrophobic amino acid residues such as proline, leucine, and isoleucine were mainly involved in establishing strong interactions with ACE (Tables 1–4), which is in accordance with reported structure-activity relationships (SAR) [33].

Sampatrilat ((S,S,S)-*N*-{1-[2-carboxy-3-(*N*-mesyllysylamino)propyl]-1-cyclopentylcarbonyl}tyrosine) (Figure 6) is a potent dual inhibitor of ACE and neutral endopeptidase. In the treatment of chronic heart failure, Sampatrilat could potentially provide a greater benefit than traditional ACE inhibitors [50,51]. In a recent study investigating the binding of Sampatrilat to the active site of ACE, the amino acid residues involved in the interactions with Sampatrilat were reported [47]. Interestingly, IIAE, LIVTQ, and LVYPFP interacted with three of these previously identified amino acid residues: IIAE interacted with residue Gln 259, LVYPFP interacted with residue His 331, and IIAE, LIVTQ, and LVYPFP all interacted with residue Thr 358. Furthermore, previous studies stated that the ACE-inhibitory drugs Captopril, Lisinopril, and Enalapril interact with ACE amino acid residues Gln 281, His 353, Glu 384, Lys 511, His 513, and Tyr 520 [52–54]. Apart from the amino acid residues in common, Lisinopril was reported to interact with Ala 354, Tyr 523, and Glu 162, and Enalapril to interact with Ala 354 and Tyr 523 [52,54]. According to the docking results, the peptides IIAE and LIVTQ also interacted with two of these residues: IIAE interacted with residue Asp 140 in common with Lisinopril, and LIVTQ interacted with Ala 332 in common with Lisinopril and Enalapril. (Amino acid residues are reported according to the Sampatrilat (PDB code 6F9V) amino acid sequence numbering; see Table A1.)

**Figure 6.** Chemical structure of Sampatrilat [47].

Overall, the docking results together with comparisons with the ACE inhibitory drugs provide strong evidence for the ACE inhibitory activity of IIAE, LIVTQ, and LVYPFP. In our previous work [31], IIAE, LIVTQ, and LVYPFP were identified as major peptides within fractions of high ACE inhibitory activity. Additionally, based on known structure-activity relationships, it was assumed that these were the main contributors to the ACE inhibitory activity measured. The docking results herein corroborate these assumptions and suggest that most probably these are potent ACE inhibitors that will contribute to the ACE inhibitory and antihypertensive activity in vivo. Further work will be needed, using pure synthesized peptides, to confirm ACE inhibition and activity in vivo.

#### **4. Materials and Methods**

#### *4.1. Whey-Protein Derived Peptides*

In our previous work where we characterized angiotensin-converting enzyme (ACE) inhibitory peptides produced by enzymatic hydrolysis of whey proteins [31], peptide sequences were identified as major peptides in fractions from the enzymatic hydrolysates CDP (casein-derived peptides) and β-lactoglobulin. The well-known antihypertensive peptide IPP, along with some other novel peptide sequences that have structural similarities with reported ACE inhibitory peptides, such as Leu-Val-Tyr-Pro-Phe-Pro (LVYPFP), Leu-Ile-Val-Thr-Gln (LIVTQ), and Ile-Ile-Ala-Glu (IIAE) were characterized and identified by a combination of chemical characterization (LC/MS; MS/MS) and SAR data. Their ACE inhibitory activity is summarized in Table 5; the IC50 is defined as the peptide concentration required to reduce the ACE activity by half.


**Table 5.** IC50 (μg/mL) value of the ACE-inhibitory peptide sequences.

| Peptide | Protein source | IC50 (μg/mL) | Reference |
|---|---|---|---|
| LIVTQ | β-Lg | 113 | [31] |
| LVYPFP | Casein | 97 | [55] |

\* IC50 of β-Lg hydrolysate containing this peptide as one of the major peptides.

#### *4.2. Homology between Human ACE and Rabbit ACE*

EMBL-EBI (https://www.ebi.ac.uk/) was queried for human and rabbit ACE amino acid sequences, together with known three-dimensional protein structures. Reviewed sequences were selected and the protein sequence files were downloaded. The accession codes for the human ACE and the rabbit ACE used in this work are P12821 and P12822, respectively. The two sequences were then uploaded to EMBOSS Needle (https://www.ebi.ac.uk/Tools/psa/emboss\_needle/) for pairwise sequence alignment and comparison.
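EMBOSS Needle performs a global alignment based on the Needleman-Wunsch algorithm. The sketch below illustrates the principle on toy sequences only. It is not the EMBOSS implementation: it assumes a simple match/mismatch score and a linear gap penalty, whereas Needle uses a substitution matrix and affine gap penalties, and the function names are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align two sequences; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                aligned_a.append(a[i - 1])
                aligned_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical columns as a percentage of the alignment length."""
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    total, top, bottom = needleman_wunsch("ACGT", "AGT")
    print(total, top, bottom, percent_identity(top, bottom))
```

Note that this sketch reports identity only. A similarity percentage such as the one quoted for human and rabbit ACE additionally counts conservative substitutions, which requires a substitution matrix.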

#### *4.3. Molecular Docking*

#### 4.3.1. Docking Validation

In order to validate the accuracy and the reliability of the docking procedure to be used in this study, the original ligand (extracted from the coordinate files taken from the Protein Data Bank; PDB code 6F9V) was docked into the corresponding crystal structure of the receptor, using the automated docking procedure in the program Surflex-Dock (SFXC) [56], as provided by SYBYL-X 2.1. The docked ligand mode and orientation from the docking procedure were compared to those found in the actual crystal structure of the complex using PyMOL and PDBeFold [57,58]. Following the docking procedure, the root mean square deviation (RMSD) between the docked ligand and the ligand, as found in the crystal structure, was calculated. The success of the docking process depended on whether the RMSD between the real and best-scored docked conformations was within the 2 Å grid spacing used in the docking procedure [59], and whether the molecular interactions were replicated. In this case, Sampatrilat was docked into the human ACE receptor as validation of the docking procedure.
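The RMSD criterion can be illustrated with a short calculation. For N matched atoms, RMSD = sqrt((1/N) Σ |r_i − r'_i|²), where r_i and r'_i are the positions of atom i in the docked and crystallographic poses. The sketch below is a minimal example under stated assumptions: the coordinates are made up for illustration, the atoms are assumed to be listed in the same order in both poses, no superposition is applied (both poses share the receptor frame), and the function names are hypothetical. It does not reproduce the PyMOL or PDBeFold procedures.

```python
import math


def rmsd(coords_a, coords_b) -> float:
    """Root mean square deviation between two equally ordered lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def passes_redocking_check(value: float, threshold: float = 2.0) -> bool:
    """Apply the acceptance criterion used here: RMSD within the 2 Å grid spacing."""
    return value <= threshold


if __name__ == "__main__":
    # Hypothetical coordinates (in Å) for a three-atom fragment, not taken from 6F9V.
    crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    docked_pose = [(0.0, 0.0, 0.1), (1.5, 0.0, 0.1), (1.5, 1.5, 0.1)]
    value = rmsd(crystal_pose, docked_pose)
    print(round(value, 3), passes_redocking_check(value))
```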

#### 4.3.2. Docking Procedure

Whey protein-derived peptides Ile-Pro-Pro (IPP), Leu-Ile-Val-Thr-Gln (LIVTQ), Ile-Ile-Ala-Glu (IIAE), and Leu-Val-Tyr-Pro-Phe-Pro (LVYPFP) were used as ligands in separate docking runs. Docking was performed using the docking algorithm Surflex-Dock, as provided in Sybyl-X 2.1. The X-ray crystallographic structure of sampatrilat-Asp in complex with Angiotensin-I-converting enzyme (PDB code 6F9V, 1.69 Å resolution) retrieved from the protein data bank (PDB) was chosen as the target protein for the docking studies, based on its high resolution structure co-crystallized with sampatrilat-Asp [47].

The Biopolymer Structure Preparation Tool, with the implemented default settings provided in the SYBYL programme suite, was used to prepare the protein structure for docking; hydrogens were added to the protein structure in idealised geometries, backbone and sidechains were repaired, residues were protonated, sidechain amides and sidechain bumps were fixed, stage minimization was performed, and all water and any ligand molecules were removed.

The three-dimensional (3D) structure of each ligand was constructed, using the "Build Protein" tool, as provided in Sybyl-X. Once constructed, charges were assigned to each atom of each molecule, using Merck Molecular Force Field (MMFF94) charges. Localized energy minimizations were then performed, and the final structure for each ligand in its lowest energy conformation was used for subsequent docking experiments. The resulting 3D coordinate files were converted to a MOL2 format for subsequent use in Surflex-Dock experiments, as provided in the SYBYL-X 2.0 software suite.

Surflex-Dock is a search algorithm that utilizes an empirically derived scoring function whose parameters are based on protein-ligand complexes of known affinities and structures. This method employs a "protomol", which is an idealized active site, as a target to generate presumed poses of molecules or molecular fragments. The protomol is employed as a mimic of the ideal interactions made by a perfect ligand to the active site of the protein. This molecular-similarity based alignment allows for optimization of potentially favorable molecular interactions, such as those defined by van der Waals forces and hydrogen bonds. In the present work, the protomol was defined by optimizing the threshold and bloat values to 0.5 and 0, respectively, to create a protomol that adequately described the binding pocket of interest. The extent of the protomol and its degree of coverage of an active site are controlled by these two parameters: the threshold value indicates the amount of buried-ness for the primary volume used to generate the protomol, and the bloat parameter determines the number of Ångstroms by which the search grid beyond that primary volume should be expanded. It is generally better to err on the side of a small protomol than on a protomol that is too large [60]. All parameters within the docking suite were left as the default values as established by the software [61,62]. Each peptide was then individually docked into the protomol site, using the "Docking Suite" application, as provided in the SYBYL programme suite. The docking results were visualised using the programme Maestro.

Molecular interactions for the docking results are reported according to the Sampatrilat (PDB code 6F9V) amino acid sequence numbering; for comparisons with the sequence numbering used in the studies referred to here, see Appendix A (Table A1). The software Maestro was used for the identification and characterisation of hydrogen bonds and salt-bridge interactions established between residues at the ACE active site and the peptides.

#### **5. Conclusions**

This study reports, for the first time, an investigation of the potential interactions between naturally produced whey peptides and ACE using a molecular docking approach. The peptides IPP, IIAE, LIVTQ, and LVYPFP formed strong H bonds and salt bridge interactions with residues in the active site of human ACE. Moreover, a comparison with commercial ACE inhibitory drugs showed that the natural peptides interacted similarly to the drugs, mimicking the same interactions with ACE active site residues. This study provides strong evidence for the ACE inhibitory activity of milk-derived peptides that have not been tested in vivo before. The results of this study of novel whey-derived peptides could lead to the production of novel ACE inhibitors.

**Author Contributions:** Conceptualization, K.A.W. and P.J.; methodology, K.A.W. and P.J.; software, K.A.W.; validation, Y.C., K.A.W. and P.J.; formal analysis, Y.C., K.A.W. and P.J.; investigation, Y.C. and P.J.; resources, P.J.; data curation, Y.C.; writing—original draft preparation, Y.C.; writing—review and editing, K.A.W. and P.J.; visualization, Y.C.; supervision, K.A.W. and P.J.; project administration, P.J.; funding acquisition, P.J. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

**Table A1.** Comparisons between Captopril (PDB code 1O86) and Sampatrilat (PDB code 6F9V) sequence numbering.


#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **BIOPEP-UWM Database of Bioactive Peptides: Current Opportunities**

#### **Piotr Minkiewicz \*, Anna Iwaniak and Małgorzata Darewicz**

Chair of Food Biochemistry, University of Warmia and Mazury in Olsztyn, Plac Cieszyński 1, 10-726 Olsztyn-Kortowo, Poland; ami@uwm.edu.pl (A.I.); darewicz@uwm.edu.pl (M.D.)

**\*** Correspondence: minkiew@uwm.edu.pl; Tel.: +48-89-523-37-15

Received: 25 October 2019; Accepted: 25 November 2019; Published: 27 November 2019

**Abstract:** The BIOPEP-UWM™ database of bioactive peptides (formerly BIOPEP) has recently become a popular tool in the research on bioactive peptides, especially on those derived from foods and being constituents of diets that prevent the development of chronic diseases. The database is continuously updated and modified. The addition of new peptides and the introduction of new information about the existing ones (e.g., chemical codes and references to other databases) is in progress. New opportunities include the possibility of annotating peptides containing D-enantiomers of amino acids, a batch processing option, conversion of amino acid sequences into SMILES code, new quantitative parameters characterizing the presence of bioactive fragments in protein sequences, and finding proteinases that release particular peptides.

**Keywords:** bioactive peptides; database; proteolysis; SMILES code; foods; nutrition; chronic diseases; nutraceuticals

#### **1. Introduction**

The BIOPEP-UWM database is freely accessible without registration at the following website: http://www.uwm.edu.pl/biochemia/index.php/pl/biopep. Bioinformatic databases and software have become basic tools in the research on biologically active peptides, e.g., those derived from food. Their role has been described in several reviews [1–7]. The BIOPEP-UWM™ (formerly BIOPEP) database of bioactive peptides is one of these tools. It has been available on the internet since 2003. Its previous versions have been described in publications by Minkiewicz et al. [8] and Iwaniak et al. [9]. The database has recently been widely used in food and nutrition science as a source of information about peptides that are in the focus of interest as putative components of functional foods involved in the prevention of chronic diseases [5,7,10]. Over 350 articles are available that describe results obtained, verified, or interpreted with the help of the BIOPEP-UWM database of bioactive peptides (excluding those contributed by database curators). Links to the BIOPEP-UWM™ database are now available via such websites as MetaComBio [11], LabWorm, and OmicX. Information about peptides from the database is integrated into the SpirPep [12] and FeptideDB [13] databases.

The BIOPEP-UWM™ database is continuously updated and modified. Several new options have been introduced since the publication of the last article describing it [9]. The aim of the present publication is to provide information helpful in work with the current version of the database and associated tools, including the use of new options introduced in the last three years.

#### **2. Database Organization**

The scheme of organization of the BIOPEP-UWM homepage is presented in Figure 1. The screenshot of the homepage is available in Supplementary Figure S1. Apart from the database of bioactive peptides described in this article, the BIOPEP-UWM contains databases of proteins, allergenic proteins, and their epitopes [14] as well as sensory peptides and amino acids [9]. The homepage also has a tab that allows users to submit new peptide sequences (not annotated yet in the database) or new activities (not annotated) of the existing peptides (see Supplementary Figure S2), and also a new BIOPEP-UWM news tab (not indicated in Figure 1).

**Figure 1.** Scheme of organization of the BIOPEP-UWM database of bioactive peptides.

The "bioactive peptides" tab links to the list of bioactive peptides (Supplementary Figure S3). More detailed information about a particular peptide sequence is available via the "peptide data" tab attributed to each peptide. The page with the peptide list contains links to associated tools enabling the processing of peptide and protein sequences (via the "analysis" tab). Scrolling down using the bar to the left of the table (Supplementary Figure S3) opens a window that allows the input of search queries.

#### **3. Enlarging the Number of Peptides in the Database by BIOPEP-UWM**™ **Users**

The BIOPEP-UWM database is a curated database. Although it is regularly enriched with new peptides, it is practically impossible to insert all the bioactive peptides that are continuously being reported in the literature. Thus, the "submit new peptides" option (see the BIOPEP-UWM homepage; Supplementary Figure S1) enables users to send us a peptide sequence not found in our database so far. The peptide sequence to be added to BIOPEP-UWM has to be provided in one-letter code by pasting it into the window that appears after clicking the "submit new peptides" tab. All peptides sent this way are verified by our curators and can be uploaded to the database on condition that the sender has provided an e-mail address and reference data (i.e., details of the article in which the peptide was published). Providing the sender's address enables the generation of an automatic e-mail confirming that the peptide of interest was successfully submitted by the user to the BIOPEP-UWM database. Publication details are needed to verify the information sent. The lack of the sender's e-mail address or of reference data on the peptide to be inserted into BIOPEP-UWM (mandatory fields for successful submission) makes the submitted information incomplete and may temporarily exclude the sequence from the process of uploading it to our database.

#### **4. Peptide Information**

The current layout of peptide information in the BIOPEP-UWM database has earlier been used in the database of sensory peptides and amino acids [9]. Its implementation into the database of bioactive peptides is still in progress. Information about an example peptide with a GHS sequence (BIOPEP-UWM ID 9473) [15] is presented in Table 1. The screenshot of a peptide page is presented in Figure S4.


**Table 1.** Content of a page of a representative peptide.

The ID number is the first piece of information displayed on a peptide page. A peptide with a single activity annotated in the BIOPEP-UWM possesses one ID number. A peptide annotated as multifunctional possesses more ID numbers. The representative GHS peptide is annotated in the BIOPEP-UWM database twice, i.e., as an inhibitor of renin (EC 3.4.23.15) and angiotensin-converting enzyme (EC 3.4.15.1) [15] (ID 9472 and 9473, respectively). Database ID may serve as an unambiguous identifier of a compound, e.g., peptide. Examples of using ID numbers from peptide databases (e.g., BIOPEP-UWM) as peptide identifiers may be found in, e.g., recent publications of Skrzypczak et al. [16] and Khazaei et al. [17].

Name is the second piece of information on the page of an individual compound. Peptide names are often identical to their activity (e.g., ACE inhibitor). Some well-known peptides possess their own names, e.g., soybean lunasin (BIOPEP-UWM ID 9525 and 9526), the role of which has been reviewed by Hsieh et al. [18].

Peptide sequences are annotated in the BIOPEP-UWM database of bioactive peptides using a standard one-letter code describing 20 protein amino acids and their D-enantiomers (a recently added option). A peptide with the FhL sequence (L-Phe-D-His-L-Leu) [19] (BIOPEP-UWM ID 9475) may serve as an example of a peptide containing a D-amino acid residue. The database offers an opportunity to annotate C-terminal amidation using the "~" symbol [8]. This symbol is, however, not universal. The EROP-Moscow database [20] uses the "z" symbol for the same purpose. There is also an opportunity (not exploited to date) to annotate phosphoserine using "B" and "b" symbols for the L- and D-enantiomer, respectively. InChIKey is an unambiguous chemical identifier [21]. It always contains 27 characters and is sufficient for search via both search engines of chemical databases and common search engines such as Google™. InChIKey is used as a name in the case of some compounds annotated in the PubChem database [22].
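The notation described above can be read mechanically. The following sketch is an illustration only and is not part of the BIOPEP-UWM tools; the function names are hypothetical. It interprets upper-case letters as L-residues, lower-case letters as D-residues, "B"/"b" as phosphoserine, and a trailing "~" as C-terminal amidation, and it checks the general 27-character layout of a standard InChIKey (14 letters, a hyphen, 10 letters, a hyphen, 1 letter).

```python
import re

# One-letter codes of the 20 protein amino acids plus "B" for phosphoserine.
ALLOWED_CODES = set("ACDEFGHIKLMNPQRSTVWY") | {"B"}

# General layout of a standard InChIKey: 14 + 10 + 1 upper-case letters, hyphen-separated.
INCHIKEY_PATTERN = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def parse_sequence(annotation: str):
    """Return ([(residue, 'L' or 'D'), ...], amidated) for a BIOPEP-UWM style sequence."""
    amidated = annotation.endswith("~")
    core = annotation[:-1] if amidated else annotation
    residues = []
    for symbol in core:
        if symbol.upper() not in ALLOWED_CODES:
            raise ValueError(f"unexpected symbol: {symbol!r}")
        # Lower case marks the D-enantiomer, upper case the L-enantiomer.
        residues.append((symbol.upper(), "D" if symbol.islower() else "L"))
    return residues, amidated


def looks_like_inchikey(text: str) -> bool:
    """Check the 27-character InChIKey layout (format only, not chemical validity)."""
    return bool(INCHIKEY_PATTERN.match(text))


if __name__ == "__main__":
    print(parse_sequence("FhL"))
    # InChIKey of aspirin, used here only as a format example.
    print(looks_like_inchikey("BSYNRYMUTXBXSQ-UHFFFAOYSA-N"))
```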

Information about the biological activity is inserted as activity (short version), activity code (abbreviation of activity), and function (more detailed version). The current list of activities of peptides found in the BIOPEP-UWM database of bioactive peptides is provided in Table 2. The list of bioactivities has been rearranged compared with that published in 2008 [8] to remove redundancy (e.g., to remove synonymous or extremely rare activities). On the other hand, several new activities, especially those concerning inhibition of enzymes, have recently been added. Annotation of bioactive peptides as compounds interacting with individual enzymes is preferred by users of the BIOPEP-UWM database, as in the case of, e.g., renin inhibitors [23–25]. Information concerning the role particular enzymes play in metabolic pathways has recently become available in specialized databases [26].

The peptide entry page also provides the chemical (average) and monoisotopic molecular mass of the peptide and a reference describing its given activity.

Completion of the contents of the "additional information" and "database references" tabs is in progress. The "additional information" tab includes the peptide structure written using two chemical codes: SMILES [27], the most popular chemical code, and InChI, recommended by IUPAC [21]. These codes represent a typical language of cheminformatics (i.e., chemical informatics) [26], which is considered an emerging method in food science [28,29]. SMILES and InChI codes, as well as InChIKeys, are used as input data for the search of molecules in chemical databases [26,30]. The supplement to our previous review [31] may provide insights on how much information about peptide bioactivity is presented in chemical databases. An InChIKey is sufficient to search via common search engines such as Google™, which enables, e.g., finding peptides annotated in the BIOPEP-UWM database. Many programs predict the physicochemical and biological properties of chemical compounds from chemical codes such as SMILES. This code may be converted into more than one hundred formats used in chemical informatics, for instance, by OpenBabel software [32]. Examples of using programs which require chemical codes as input data for in silico analysis and prediction of properties of food peptides have been recently presented by Ortiz-Martinez et al. [33], Mojica et al. [34], and Yu et al. [35]. Amino acid sequences are converted into SMILES code using applications available in the BIOPEP-UWM database via the "analysis" tab. Conversion of SMILES representations into InChI and InChIKeys is performed using OpenBabel or MarvinSketch software.

A peptide with a C-terminal amide group cannot be found in protein sequences. Precursors of these peptides, containing C-terminal glycine residues, are thus added. The mechanism of amidation includes the substitution of the C-terminal glycine with an amide group [36]. The peptide with ID 2580 may serve as an example of this type of annotation. It is a precursor of an antibacterial peptide [37] annotated as ID 2579. Information about amidation is provided in the "additional information" tab of the peptide that is the precursor of the amidated form (in the above example, the peptide annotated as ID 2580).

The "additional information" tab also contains brief information about activities of the peptide taken from the BIOPEP-UWM database of bioactive peptides and other databases as well as information about peptide taste from the BIOPEP-UWM database of sensory peptides and amino acids [9].

For some of the peptides, the "additional information" tab also includes information about food resources and products as well as different IC50 values.


**Table 2.** List of activities of peptides annotated in the BIOPEP-UWM database of bioactive peptides.



<sup>1</sup> More information concerning enzymes inhibited by peptides is available in the following databases: ExplorEnz [38], BRENDA [39], ChEMBL [40], and MEROPS [41]. Information about associations between abnormal enzyme activity and diseases may be found in the OpenTargets database [42]. <sup>2</sup> Activities absent in the version described in our publication from 2008 [8].

The "database reference" summarizes databases providing information about a given peptide (for example, see Table 1). The list of databases most commonly cited in the above tab is presented in Table 3. The list has been significantly enriched since the publication of our previous article describing the database [9]. ID numbers of peptides are also provided in particular databases. Some databases (such as ACToR [43] or ChemIDPlus [44]) use CAS registry numbers as compound identifiers. The databases are available via the MetaComBio website [11] or the "useful links" tab on the BIOPEP-UWM website. The list of databases cited has been significantly enlarged since 2016 (Table 3).

The last tab, "screen and print peptide data", summarizes all data concerning a given peptide. Supplementary Table S1 is copied directly from this tab. In the supplement to our previous publication [9], we pointed out the opportunity for providing links to this tab from other resources. Examples of such links are available in the supplement to our review concerning taste-affecting peptides [31]. Here we offer the opportunity to construct links to peptide pages ("activity" tabs). The data of an example peptide (ID 9473) can be found at the following address: http://www.uwm.edu.pl/biochemia/biopep/peptide\_data\_page1.php?zm\_ID=9473. The ID at the end of the address (9473) may be replaced by another one to generate a link to the data of another peptide. An analogous link to a representative sensory peptide is as follows: http://www.uwm.edu.pl/biochemia/biopep/sensory\_data\_page1.php?zm\_ID=2.
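The link scheme described above can be sketched as a small helper that builds both kinds of address from an ID number. The helper name and its `sensory` flag are hypothetical; only the two address patterns come from the text, and the sketch does not check whether the pages still resolve.

```python
from urllib.parse import urlencode

# Address prefix taken from the two example links quoted in the text.
BASE = "http://www.uwm.edu.pl/biochemia/biopep/"


def peptide_link(peptide_id, sensory=False):
    """Build a link to a peptide page (hypothetical helper, for illustration)."""
    page = "sensory_data_page1.php" if sensory else "peptide_data_page1.php"
    return f"{BASE}{page}?{urlencode({'zm_ID': peptide_id})}"


bioactive_url = peptide_link(9473)
sensory_url = peptide_link(2, sensory=True)
```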


**Table 3.** Databases cited on the "Database reference" page and other bioinformatic tools mentioned in the publication.

<sup>1</sup> Accessed in July and August 2019. <sup>2</sup> Tools cited in our previous publication [9]. \* No reference available.

#### **5. Search Options**

Search options are summarized in Table 4 and Supplementary Figure S5.



*Int. J. Mol. Sci.* **2019**, *20*, 5978

Search options available in the BIOPEP-UWM database of bioactive peptides fall into the following major categories: text-based (ID, name, activity, reference, and InChIKey), structure-based (sequence-based), and property-based (number of amino acid residues and molecular mass). They are typical of peptide databases [4]. The use of an ID number as a query is the first search option. A single ID number corresponds to a single peptide with one defined activity. Search by name or by activity offers two possibilities to the user: finding all names or all activities including the chosen word or text fragment or exact search (see Supplementary Figure S5). The first opportunity leads to finding more peptides that fulfill the search criterion. Using the word "hemorphin-7" as a query, we can find four peptides (ID 2570, 2973, 3079, and 9001) without using the exact search option and only one (ID 3079) using the exact search option (search performed on 30 August 2019).

The search menu contains a link to the list of activities (Supplementary Figure S5), which serves for choosing a query. In contrast to Table 2, the bioactivities are listed in chronological (not alphabetical) order. Again, it is possible to use the exact search option. Using the word "inhibitor" as a query without the exact search option gave a list of 1552 peptides as an output (30 August 2019). The list contains all inhibitors of enzymes (e.g., ACE, dipeptidyl peptidase IV, and dipeptidyl peptidase III). The exact search option with the same query found only 67 peptides with the activity annotated as "inhibitor" (see Table 2).

InChIKey is the most typical identifier of compounds (e.g., peptides) in chemical databases (e.g., PubChem [22]; ChemSpider [54], and ChEMBL [40]). Although it is a unique identifier of any chemical compound, it does not provide information about its structure [21]. InChIKeys in the BIOPEP-UWM database correspond to linear peptides with all chirality centers defined, acidic and basic groups electrically neutral, and cysteine residues reduced (if any in the peptide sequence). Incomplete InChIKey used as a query may result in finding more peptides. For instance, a "DYKIIFRCSA-N" fragment occurs in three InChIKeys corresponding to the celiac toxic peptide with the sequence PSQQQP (ID 2578), ACE inhibitor GPAGAPGAA (ID 3363), and antibacterial peptide ALCSEK (ID 4011). These peptides have no common fragments (subsequences). The use of incomplete InChIKey with the exact search option will fail to produce any results.

The sequence-based search is the most common and most intuitive option used to find peptide information in the database [4]. The BIOPEP-UWM database offers an opportunity to find all longer sequences containing a query fragment or to find exactly the given sequence (exact search). The first option allows the user to find peptides containing a defined continuous motif, e.g., one attributed to a given function [74,75]. This search option also follows the fragmentomics concept [76], which assumes that shorter (functional) bioactive subsequences present in a sequence may be crucial for the biological activity of the entire peptide molecule. Examples of peptides fitting this concept may be found in the BIOPEP-UWM database (e.g., hemorphins or ACE inhibitors from caseins) and in other peptide databases such as EROP-Moscow [20], PepBank [67], SATPdb [70] or AHTPDB [45]. The exact search option is sufficient to check the bioactivity of peptides identified among protein hydrolysis products. Examples of such experiments have recently been described by Martini et al. [77] and Garcia-Vaquero et al. [78].

In the case of the property-based search (involving the number of amino acid residues or molecular mass range), choosing the exact search option does not change the output. We generally recommend using the exact search option for the sequence-based search.
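The difference between the two sequence-based search modes can be sketched as follows. The three records are the peptides quoted earlier in this section (ID 2578, 3363, and 4011); the in-memory dictionary and the function name are assumptions for illustration and say nothing about how the database implements its queries.

```python
# Three peptides quoted in the text, keyed by BIOPEP-UWM ID.
PEPTIDES = {2578: "PSQQQP", 3363: "GPAGAPGAA", 4011: "ALCSEK"}


def search_by_sequence(query, exact=False):
    """Return IDs of peptides matching the query sequence.

    exact=False: every peptide containing the query as a continuous fragment.
    exact=True:  only peptides whose whole sequence equals the query.
    """
    if exact:
        return sorted(pid for pid, seq in PEPTIDES.items() if seq == query)
    return sorted(pid for pid, seq in PEPTIDES.items() if query in seq)


containing_a = search_by_sequence("A")               # fragment search
exactly_a = search_by_sequence("A", exact=True)      # no peptide is just "A"
```

The fragment search is the broader of the two, which mirrors the larger result sets reported above for the non-exact name and activity searches.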

#### **6. Analysis**

The "analysis" page includes the following tabs: "profiles of potential biological activity", "calculations", "enzyme(s) action", "find", "batch processing", "definitions", "SMILES", and "find the enzyme for peptide release" (Supplementary Figure S6).

The profile of a potential biological activity is defined as the type and location of bioactive fragments in a protein or a peptide chain [79]. This idea is based on the assumption that the same bioactive fragment, especially a short one (2–3 amino acid residues), cannot be attributed to a single protein, but may be present in many sequences (forming so-called common subsequences) [75,79]. The concept of profiles of the potential activity of peptide fragments is consistent with the fragmentomic approach proposed by Zamyatnin [76] (see above). By default, with the asterisk selected in the toolbar, the profile covers all activities. Examples of published profiles of the potential activity of peptide or protein fragments may be found in publications of Bauchart et al. [80], Huang et al. [81], Tapal et al. [82], Khazaei et al. [17], and Jakubczyk et al. [83]. The profile may also be constructed for a specific bioactivity (bioactivity of interest) by selecting that activity instead of the asterisk from the toolbar. The menu to be used for the construction of potential biological activity profiles is shown in Supplementary Figures S7–S9. The profile of a potential biological activity of a protein or a peptide sequence is presented as a table including the following columns: ID, name of peptide, activity, number of repetitions of a particular bioactive fragment in a query sequence, sequence of the bioactive fragment, and location of the bioactive fragment in a query sequence. An example of such a profile is presented in Supplementary Table S2.
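A profile of this kind can be sketched as a scan of the query sequence for every known bioactive fragment. In the sketch below, the fragment list, its ID numbers, and the query sequence are hypothetical illustration data (VPP and IPP are used because they are widely reported ACE-inhibitory tripeptides); the column names only approximate the table layout described above.

```python
def activity_profile(sequence, fragments):
    """List bioactive fragments found in `sequence` with 1-based locations.

    `fragments` is a list of dicts with keys id, name, activity, sequence.
    Overlapping occurrences are reported.
    """
    rows = []
    for frag in fragments:
        motif = frag["sequence"]
        locations = []
        start = sequence.find(motif)
        while start != -1:
            locations.append((start + 1, start + len(motif)))  # [first, last]
            start = sequence.find(motif, start + 1)
        if locations:
            rows.append({**frag, "count": len(locations), "locations": locations})
    return rows


# Hypothetical mini-database and query; the IDs are placeholders, not real entries.
FRAGMENTS = [
    {"id": 1, "name": "ACE inhibitor", "activity": "ACE inhibitor", "sequence": "VPP"},
    {"id": 2, "name": "ACE inhibitor", "activity": "ACE inhibitor", "sequence": "IPP"},
]
profile = activity_profile("AVPPIPPVPP", FRAGMENTS)
```

Restricting the fragment list to one activity before the scan corresponds to choosing a specific activity instead of the asterisk.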

The "calculations" tab enables calculating two quantitative parameters that characterize proteins as potential precursors of bioactive peptides: the frequency of occurrence of bioactive fragments in a protein sequence (A) and the potential biological activity of protein fragments (B). Equations (1) and (2), which enable calculation of these parameters, are provided in Table 5. The menu of the "calculations" tab is shown in Supplementary Figure S10. An example of the output is presented in Supplementary Table S3. The frequency of occurrence of bioactive fragments in a protein sequence (A) is calculated for all bioactive peptides present in the query sequence (using the asterisk) or for peptides with one specific activity (by choosing the bioactivity from the toolbar). The potential biological activity of protein fragments (B) may be calculated only if the IC50 or EC50 of a peptide is available. The program skips peptides without a known IC50 or EC50 value. For instance, Supplementary Table S3 provides B values for ACE and DPPIV inhibitors only. In the case of other activities, B values have not been calculated due to the lack of IC50 or EC50 values attributed to particular peptides. Articles published by Udenigwe et al. [84] and Lin et al. [85] contain representative results of calculations of quantitative parameters characterizing food proteins as potential precursors of bioactive peptides.
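As a rough numerical illustration of the two parameters, the sketch below assumes the forms in which they are usually quoted for BIOPEP-UWM, namely A = a/N and B = Σ(a_i/EC50_i)/N, where a is the number of bioactive fragments, a_i the number of repetitions of the i-th fragment, EC50_i (or IC50_i) its half-maximal concentration, and N the number of amino acid residues. These forms are an assumption here; Table 5 remains the authoritative definition. Function names and input values are invented for the example.

```python
import math


def frequency_a(fragment_counts, n_residues):
    """A = a / N, with a the total number of bioactive fragments found (assumed form)."""
    return sum(fragment_counts) / n_residues


def potential_activity_b(fragments, n_residues):
    """B = sum(a_i / EC50_i) / N (assumed form).

    `fragments` holds (a_i, ec50_i) pairs; entries with ec50_i = None are
    skipped, mirroring the statement that peptides without IC50/EC50 are ignored.
    """
    usable = [(a, ec50) for a, ec50 in fragments if ec50 is not None]
    return sum(a / ec50 for a, ec50 in usable) / n_residues


# Invented example: two fragments in a 10-residue sequence, one without EC50.
a_value = frequency_a([2, 1], 10)
b_value = potential_activity_b([(2, 5.0), (1, None)], 10)
```

Because fragments lacking IC50 or EC50 contribute nothing to the sum, B can understate the potential of a protein whose fragments are poorly characterized.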


**Table 5.** Quantitative parameters characterizing proteins as potential precursors of bioactive peptides, available in the BIOPEP-UWM database.



<sup>1</sup> available via the "profiles" tab and "batch processing" tab. <sup>2</sup> available via the "enzyme (s) action" tab and "Batch processing" tab. <sup>3</sup> available via the "batch processing" tab only. <sup>4</sup> not displayed among the results. Shown only to explain the calculation of other parameters. \* New parameters described for the first time in this publication. Some of them have been announced in [4].

The "enzyme(s) action" tab allows simulating proteolysis catalyzed by endopeptidases. The scheme of steps required to obtain the peptides potentially released by a given enzyme (or enzymes) is presented in Figure 2. Screenshots of menus of particular tabs are presented in Supplementary Figures S11–S16. The menu also enables enzyme choice (Supplementary Figures S12 and S13). It allows the simulation of proteolysis using one to three enzymes. Example information about a single enzyme (plasmin; EC 3.4.21.7; MEROPS ID: S01.233) is presented in Supplementary Figure S14. The enzyme is annotated using a connection ID, indicating a single peptide bond hydrolyzed by the enzyme, and an enzyme ID. One enzyme may cover several connection IDs (two in the case of plasmin). Enzyme specificity is described using two terms: a recognition sequence, understood as a fragment of an amino acid sequence recognized by the proteolytic enzyme, and a cutting sequence, understood as an amino acid residue preceding or following the bond hydrolyzed by the protease [8]. The recognition sequence may contain a single amino acid residue (e.g., for plasmin) or a longer fragment, as for the ginger protease zingipain (EC 3.4.22.67; MEROPS ID: C01.017). Annotations "C-terminus" and "N-terminus" indicate bonds formed by the carboxyl and amine group of an amino acid residue, respectively, hydrolyzed by the enzyme. Data concerning particular enzymes contain references: databases such as MEROPS [41] and CutDB [56] or publications (Bastian and Brown [89] for plasmin and Huang et al. [90] for zingipain). Apart from the addition of new enzymes, the specificity has recently been modified for some of the existing ones. The modification included the addition of new recognition sequences and cutting sequences (possessing connection IDs within the range 141–184), according to data presented in the so-called specificity matrices in the MEROPS database. These matrices are continuously updated to follow newly appearing information about new sites susceptible to proteolysis in protein sequences [41]. Proteolysis simulation is simplified: it assumes that all bonds theoretically susceptible to a given proteinase are hydrolyzed. In real experiments, proteolysis is often incomplete, which may explain false-positive results, i.e., the absence of predicted peptides among the experimental products. False-negative results may be explained by incomplete knowledge about proteolytic specificity, i.e., the situation when some bonds susceptible to the proteolytic enzyme are considered resistant. The addition of new recognition and cutting sequences to the enzyme data aims to minimize the occurrence of false-negative results.
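The simplifying assumption of complete hydrolysis can be sketched in a few lines. The cleavage rule used in the example (cut at the carboxyl side of Lys and Arg) and the query sequence are illustrative assumptions, not a specificity record copied from the database, and recognition sequences longer than one residue are not modelled.

```python
def simulate_proteolysis(sequence, cut_after=frozenset(), cut_before=frozenset()):
    """Cleave every susceptible bond and return (fragment, first, last) tuples.

    cut_after:  residues whose carboxyl-side bond is hydrolyzed ("C-terminus").
    cut_before: residues whose amine-side bond is hydrolyzed ("N-terminus").
    Positions are 1-based; complete hydrolysis is assumed.
    """
    fragments = []
    start = 0
    for i in range(1, len(sequence)):
        # Bond between residue i and residue i + 1 (1-based numbering).
        if sequence[i - 1] in cut_after or sequence[i] in cut_before:
            fragments.append((sequence[start:i], start + 1, i))
            start = i
    fragments.append((sequence[start:], start + 1, len(sequence)))
    return fragments


# Illustrative rule and sequence only.
products = simulate_proteolysis("AKGRPL", cut_after={"K", "R"})
```

Simulating two or three enzymes together amounts to passing the union of their cutting residues, which is one simple way to picture the multi-enzyme option.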

**Figure 2.** Scheme of the "enzyme(s) action" tab. Option (see Figure 1) "search for enzyme with given specificity" is not included in the Figure. A screenshot of the menu of this tab is presented in Supplementary Figure S11.

Results of simulated proteolysis of an example peptide can be found in Supplementary Figure S15. Displayed results of the initial step of simulation include sequences of peptides being products of proteolysis and their location in the precursor sequence. The next step may include the search for bioactive peptides among products of simulated proteolysis or calculation of quantitative parameters characterizing the proteolysis (Figure 2 and Figure S15). The parameters available via the "enzyme(s) action" tab are calculated according to Equations (3)–(7) from Table 5. Representative results of the search for active peptides among simulated proteolysis products and calculation of quantitative parameters are presented in the Supplement (Tables S4 and S5, respectively). Calculation of parameters BE and V involves EC50 or IC50 values. If they are not available, peptides are not taken into account. Simulation of proteolysis using the BIOPEP-UWM database has recently been described by, e.g., Lin et al. [85], Yu D. et al. [91], and Kandemir-Cavas et al. [92]. Data concerning proteolysis simulation may be interpreted together with protein structures [93].

A new tab named "search for enzymes with given specificity" enables the search for information about an enzyme using recognition sequence, cutting sequence, and choice between C- and N-terminus (bond formed by carboxyl or amine group of amino acid residue, respectively). Results include the list of enzymes with a given specificity. An example of a query and result produced using the above option may be found in Supplementary Figure S16 and Supplementary Table S6. For most of the enzymes, the recognition sequence contains only one amino acid residue.

The new "find" tab enables quick retrieval of information from the protein and bioactive peptide databases. Particular tabs display a full list of protein sequences annotated in the BIOPEP-UWM database, a full list of peptides revealing a given activity, and a list of all proteins or peptides containing the query sequence (see Supplementary Figure S17). The last option enables finding all proteins or peptides containing a given bioactive fragment or a recognition sequence available for a proteolytic enzyme. An example result of a search for the VPP sequence in the database of bioactive peptides is presented in Table S7 (Supplement). Results cover links to peptide or protein data, ID number, name, and sequence.

Another new option, "batch processing", serves for the simultaneous processing of a set of several sequences of proteins or peptides being potential precursors of bioactive peptides. The total length of all sequences forming the query set may be up to ca. 1500 amino acid residues. The scheme of activities available via this option is presented in Figure 3. The screenshot of the input window is available in Supplementary Figure S18. The FASTA format [94] is used to input a set of sequences. The "batch processing" option enables performing any action available via the tabs "profiles of potential biological activity", "calculations", and "enzyme(s) action". Moreover, there are new parameters characterizing the occurrence and possibility of enzymatic release of an individual peptide from several precursor sequences (aT, aS, AS, aTE, aSE, ATE), calculated according to Equations (8)–(10) and (12)–(14) in Table 5. The distribution of particular fragments in a set of sequences may be a focus of scientific interest when using in silico methodologies [76,77,95]. Analysis may cover all possible or selected options. Supplementary Figure S18 shows a set of sequences ready for an analysis concerning bioactive peptides (excluding options concerning data from the database of allergenic proteins). The batch analysis is performed in two steps (Figure 3). The first step may be performed for all activities (default option) or a selected one. The second step may be performed after the first one has been completed. The parameters may be calculated for all bioactive fragments found in the set of sequences or for manually selected peptides. Results of the first and the second step of batch analysis are presented in Supplementary Tables S8 and S9, respectively.
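Preparing a query set for batch processing can be sketched as parsing the FASTA text and checking the total length against the limit quoted above. The parser, the function names, and the hard threshold of 1500 are assumptions for illustration; the text gives the limit only approximately, and the example records are invented.

```python
MAX_TOTAL_RESIDUES = 1500  # approximate limit quoted in the text, used as a hard cap here


def parse_fasta(text):
    """Parse FASTA text into {header: sequence}; a repeated header overwrites the earlier one."""
    records = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data before the first FASTA header")
        else:
            records[header] += line      # sequences may span several lines
    return records


def fits_batch_limit(records):
    """True if the summed sequence length does not exceed the assumed cap."""
    return sum(len(seq) for seq in records.values()) <= MAX_TOTAL_RESIDUES


# Invented two-record query set.
QUERY = """>precursor_1
RWAFAPGF
APGHIP
>precursor_2
AVPPIPPVPP
"""
query_records = parse_fasta(QUERY)
```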

**Figure 3.** Scheme of the "batch processing" tab action.

The "definitions" tab summarizes terms and definitions used in the BIOPEP-UWM database including equations used to calculate quantitative parameters, as shown in Table 5.

The "SMILES" tab, introduced in 2018, enables translating amino acid sequences (written using standard one-letter code) into the chemical language "SMILES". SMILES representations are built according to a simplified algorithm described by Siani et al. [96]. SMILES codes of particular amino acid residues are written using the same layout as used in the SwissSidechain database [72] and source codes of the CycloPs program [97] (program temporarily unavailable). The procedure was tested and verified according to recommendations proposed in our previous publication [98]. The MarvinSketch 17.28 software (ChemAxon, Budapest, Hungary) was used to test and verify SMILES strings of peptides. The application utilizes the sequences of peptides built from 20 protein amino acids, their D-enantiomers, L- and D-phosphoserine (Symbols B and b, respectively), and C-terminal amide group. It is easy and fast in use and can process linear peptides only. Disulfide bonds and other modifications may be inserted using molecule editors (e.g., MarvinSketch, Dendrimer Builder program provided by the University of Bern, Switzerland, and molecule editor of the NANPDB [65] database) which may serve as alternatives to our application. The first one may be used to construct any molecules from building blocks drawn or imported as SMILES strings, the second, to build representations of branched peptides containing some non-protein amino acids, whereas the third to encode pyrrolysine and selenocysteine apart from 20 most common protein amino acids. Our application converts amino acid sequences into the so-called aromatic SMILES. Some search engines do not utilize this version [30]. The aromatic version of the SMILES string may be converted into an alternative, so-called Kekule version using, e.g., the molecule editor of the PubChem database [99] or MarvinSketch software. Screenshots of the "SMILES" tab window with query and result are given in the Supplementary Figures S19 and S20, respectively. 
Two types of SMILES representations of the example peptide may be found in Supplementary Table S10.
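The principle of the sequence-to-SMILES translation can be sketched for a handful of residues. The residue strings below follow a common N-to-C layout and are assumptions of this sketch; they cover only six amino acids and are not copied from the application, whose output may differ in atom order. D-residues are obtained by inverting the chirality tag, and "~" replaces the terminal hydroxyl with an amide group.

```python
# Residue units in N-to-C order (illustrative subset; aromatic SMILES for rings).
RESIDUES = {
    "G": "NCC(=O)",
    "A": "N[C@@H](C)C(=O)",
    "S": "N[C@@H](CO)C(=O)",
    "L": "N[C@@H](CC(C)C)C(=O)",
    "F": "N[C@@H](Cc1ccccc1)C(=O)",
    "H": "N[C@@H](Cc1c[nH]cn1)C(=O)",
}


def sequence_to_smiles(seq):
    """Translate a linear peptide such as 'FhL' or 'AG~' into a SMILES string."""
    amidated = seq.endswith("~")
    core = seq[:-1] if amidated else seq
    if not core:
        raise ValueError("empty sequence")
    units = []
    for symbol in core:
        unit = RESIDUES.get(symbol.upper())
        if unit is None:
            raise ValueError(f"residue not covered by this sketch: {symbol!r}")
        if symbol.islower():
            # Lower case = D-enantiomer: flip the chirality tag (no effect on Gly).
            unit = unit.replace("[C@@H]", "[C@H]")
        units.append(unit)
    # Close the C-terminus as a carboxylic acid or as an amide.
    return "".join(units) + ("N" if amidated else "O")


fhl_smiles = sequence_to_smiles("FhL")
```

The output is the aromatic form of SMILES; conversion to the Kekulé form would still require one of the external tools named above.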

The interpretation of the output of the new tab entitled "find the enzyme for peptide release" is summarized in Figure 4. The screenshot of the menu of this tab and representative results are shown in Supplementary Figure S21 and Supplementary Table S11, respectively. The input includes peptide sequences provided in FASTA format and the precursor (protein or peptide) sequence. The output includes a list of all enzymes with the specificity sufficient to catalyze particular proteolytic events. A proteolytic event is understood as a case of cleavage of an individual peptide bond. This term has been introduced in the CutDB database [56]. Release of a peptide from the precursor sequence requires two proteolytic events: cleavage of the bond preceding the N-terminus and of the bond following the C-terminus (indicated in Supplementary Table S11 as N and C, respectively). If a given peptide appears in the precursor sequence more than once, then the particular events attributed to this peptide are indicated as 1N, 1C, 2N, 2C, and so forth. An example peptide with the AP sequence occurs in the precursor sequence RWAFAPGFAPGHIP twice (positions 5–6 and 9–10). Its release is associated with four proteolytic events: 1N, cleavage of the bond between residues 4 and 5; 1C, cleavage of the bond between residues 6 and 7; 2N, cleavage of the bond between residues 8 and 9; and 2C, cleavage of the bond between residues 10 and 11. Displayed results concerning enzymes catalyzing the particular proteolytic events cover the following data: name, EC number, enzyme ID in the BIOPEP-UWM database, connection ID, cutting sequence, and recognition sequence. The cutting sequences are described using the symbols "+" and "−" assigned to the amino acid symbols. The symbol "+" means that the amino acid residue follows the cleaved bond, i.e., this bond is formed by its amine group. For example, the symbol "A+" means that the cleaved bond is formed by the amine group of alanine. The symbol "−" means that the amino acid residue is located before the cleaved bond, i.e., this bond is formed by the carboxyl group of the amino acid. For instance, the symbol "W−" means a bond formed by the carboxyl group of tryptophan. Enzymes releasing the N- and C-terminus are summarized separately. This solution may be justified by the fact that peptides may be released by more than one enzyme (the N- and C-terminus need not be released by the same enzyme). This process can be exemplified by protein digestion in the human gastrointestinal tract [100].
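The bookkeeping of proteolytic events can be sketched with the AP example from the text. The function names and the dictionary layout are assumptions for illustration; an ASCII hyphen stands in for the minus symbol of the cutting sequences, and matching the events against enzyme specificities is left out.

```python
def proteolytic_events(precursor, peptide):
    """Map event labels (1N, 1C, 2N, ...) to the cleaved bond as (i, i + 1), 1-based.

    A value of None means the peptide already sits at that end of the precursor,
    so no cleavage is needed there.
    """
    events = {}
    occurrence = 0
    start = precursor.find(peptide)
    while start != -1:
        occurrence += 1
        first = start + 1                  # 1-based position of the first residue
        last = start + len(peptide)        # 1-based position of the last residue
        events[f"{occurrence}N"] = (first - 1, first) if first > 1 else None
        events[f"{occurrence}C"] = (last, last + 1) if last < len(precursor) else None
        start = precursor.find(peptide, start + 1)
    return events


def bond_symbols(precursor, bond):
    """Cutting-sequence symbols that describe a bond, e.g. ('F-', 'A+')."""
    before, after = bond
    return precursor[before - 1] + "-", precursor[after - 1] + "+"


events = proteolytic_events("RWAFAPGFAPGHIP", "AP")
symbols_1n = bond_symbols("RWAFAPGFAPGHIP", events["1N"])
```

For event 1N the two symbols describe the same bond from either side: an enzyme annotated with "F-" or one annotated with "A+" would be a candidate for that cleavage.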

**Figure 4.** Scheme of "find enzyme for peptide release" tab action.

#### **7. Useful Links and Other Tabs**

The BIOPEP-UWM plays the role of a metaserver enabling access to databases and software useful in research concerning peptides and proteins. The linked tools available via the "useful links" tab (Figure 1; Supplementary Figure S1) are divided into categories according to Minkiewicz et al. [101]. These categories are summarized in Table 6.


**Table 6.** Categories of bioinformatic tools available via the "useful links" tab.

Other tabs available from the BIOPEP-UWM main page are as follows: List of publications of our group concerning the BIOPEP-UWM database, brief summary concerning the database ("about BIOPEP-UWM" tab), publications concerning particular parts of the BIOPEP-UWM database recommended to be cited by users, and contact data of database curators.

#### **8. Final Remarks**

This paper presents the current status of the BIOPEP-UWM™ database, including changes introduced within the period of 2016–2019. Apart from the addition of new peptides (562 items added since submission of our last publication describing the database of sensory peptides and amino acids [9]), information about the existing ones has been completed (especially chemical codes and database references). We also added several new options, which are summarized in Table 7.


**Table 7.** New options in the BIOPEP-UWM database and modifications of existing ones, not described in the previous publications [8,9].

<sup>1</sup> The application for converting amino acid sequences into SMILES code has been announced in [4].

The content of this publication is not restricted to description of new changes in the database and associated tools during the last three years. We try to provide a complete description including both old and new options.

Future modifications will aim at removing the weak points of the database and associated applications. We would like to ask users to submit new peptides (via the current version of the "submit new peptide" tab) and any remarks helpful in improving the bioinformatic tool described in this paper.

**Supplementary Materials:** Supplementary materials can be found at http://www.mdpi.com/1422-0067/20/23/5978/s1.

**Author Contributions:** P.M., A.I., and M.D. are curators of the BIOPEP-UWM database. P.M., M.D., and A.I. designed new options and applications associated with the BIOPEP-UWM database. P.M., A.I., and M.D. have written the manuscript. Funding acquisition—M.D. and A.I.

**Funding:** The project was financially supported by the Minister of Science and Higher Education in the range of the program entitled "Regional Initiative of Excellence" for the years 2019–2022, Project No. 010/RID/2018/19, amount of funding 12,000,000 PLN and University of Warmia and Mazury, grant number 17.610.014-300.

**Acknowledgments:** Authors thank Krzysztof Sieniawski and Mariusz Falkowski (Enter Krzysztof Sieniawski, Olsztyn, Poland) for IT support; and also Monika Hrynkiewicz, Marta Turło, Agnieszka Skwarek, Monika Pliszka, and Piotr Starowicz, for adding new data to the BIOPEP-UWM database; furthermore Iwona Szerszunowicz and Kamila Licka for pointing out some weak points of the database and associated software; and finally ChemAxon (Budapest, Hungary) for academic license for MarvinSketch program.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**


#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Systematical Analysis of the Protein Targets of Lactoferricin B and Histatin-5 Using Yeast Proteome Microarrays**

#### **Pramod Shah 1,2, Wei-Sheng Wu <sup>3</sup> and Chien-Sheng Chen 1,2,4,\***


Received: 18 July 2019; Accepted: 23 August 2019; Published: 28 August 2019

**Abstract:** Antimicrobial peptides (AMPs) have potential antifungal activities; however, their intracellular protein targets are poorly reported. Proteome microarray is an effective tool with high-throughput and rapid platform that systematically identifies the protein targets. In this study, we have used yeast proteome microarrays for systematical identification of the yeast protein targets of Lactoferricin B (Lfcin B) and Histatin-5. A total of 140 and 137 protein targets were identified from the triplicate yeast proteome microarray assays for Lfcin B and Histatin-5, respectively. The Gene Ontology (GO) enrichment analysis showed that Lfcin B targeted more enrichment categories than Histatin-5 did in all GO biological processes, molecular functions, and cellular components. This might be one of the reasons that Lfcin B has a lower minimum inhibitory concentration (MIC) than Histatin-5. Moreover, pairwise essential proteins that have lethal effects on yeast were analyzed through synthetic lethality. A total of 11 synthetic lethal pairs were identified within the protein targets of Lfcin B. However, only three synthetic lethal pairs were identified within the protein targets of Histatin-5. The higher number of synthetic lethal pairs identified within the protein targets of Lfcin B might also be the reason for Lfcin B to have lower MIC than Histatin-5. Furthermore, two synthetic lethal pairs were identified between the unique protein targets of Lfcin B and Histatin-5. Both the identified synthetic lethal pairs proteins are part of the Spt-Ada-Gcn5 acetyltransferase (SAGA) protein complex that regulates gene expression via histone modification. Identification of synthetic lethal pairs between Lfcin B and Histatin-5 and their involvement in the same protein complex indicated synergistic combination between Lfcin B and Histatin-5. This hypothesis was experimentally confirmed by growth inhibition assay.

**Keywords:** Lactoferricin B (Lfcin B); Histatin-5; antimicrobial peptides (AMPs); antifungal activity; proteome microarray; synergy

#### **1. Introduction**

Fungal infections often arise in patients treated for cancer or bacterial infection and in immunocompromised patients. The evolutionarily close relationship between fungi and humans has limited the development of antifungal drugs. Moreover, the development of antifungal resistance has worsened the situation, causing major health threats [1,2]. Together, these conditions have led to an urgent need for alternative antifungal agents.

Antimicrobial peptides (AMPs) are key components of the innate immune system that protects the host from invading microorganisms: bacteria, fungi, and viruses. AMPs are short peptides (mostly cationic) with a wide range of antimicrobial activities as well as an adjunct role in immunomodulation and wound healing [3,4]. Their broad-spectrum activity, selective targeting of microorganisms, high sensitivity, multiple mechanisms of action, and low toxicity to human cells make AMPs a potent alternative to conventional antibiotics [5,6]. Antifungal AMPs are reported to target different fungal cell components [7,8], mostly the cell wall and cell membrane, causing leakage of ions and ATP [9]. AMPs also exert intracellular activities and induce the generation of reactive oxygen species (ROS), which lead to apoptosis/necrosis and cell death [10,11]. Despite the multiple mechanisms of antifungal AMPs, very few targets have been reported.

Lactoferricin B (Lfcin B) is a 25-residue peptide (derived from bovine lactoferrin; residues 17–41) with a net positive charge of +8. Lfcin B has twisted antiparallel β-sheet structure containing hydrophilic and hydrophobic residues on the alternative strand [12]. Lfcin B is also detected in the human gut as a result of bovine milk consumption [13]. Lfcin B has antimicrobial activity against bacteria, fungi, cancer cells, and others. The minimum inhibitory concentration (MIC) of Lfcin B against *Escherichia coli* (*E. coli*) wild type and *Saccharomyces cerevisiae* is 15.6–31.2 μg/mL and 0.68 μg/mL, respectively [14,15]. Lfcin B has been reported to exert intracellular activity against yeast [14,16]; however, the intracellular binding targets are unknown. Moreover, the hydrophobic residue is crucial for the translocation of Lfcin B through the cell membrane [17]. Proteome microarray is a high-throughput detection platform for the identification of proteome interactions [18]. In this study, we have systematically identified all the yeast-binding proteins of Lfcin B by using yeast proteome microarrays [19,20].

In addition to Lfcin B, we also used yeast proteome microarrays to identify the yeast protein targets of another potent antifungal AMP, Histatin-5. Histatin-5 is a 24-residue peptide derived from Histatin-3 (family: Histatins) present in human saliva. Histatin-5 has a net charge of +5 and exerts its potential mode of action by targeting intracellular molecules [21,22]. Histatin-5 has no defined structure in water but adopts an α-helical structure in dimethyl sulfoxide and aqueous trifluoroethanol [23]. The presence of Histatin-5 in the mouth provides an initial defense against pathogens and prevents their entry into the human gut [24]. The MIC reported for Histatin-5 against *Saccharomyces cerevisiae* is 128 μg/mL [25]. Despite its potential role, only a few targets of Histatin-5 have been reported [26–30]. Thus, comprehensive identification of Histatin-5 protein targets is needed.

The identified yeast protein targets of Lfcin B and Histatin-5 were subjected to bioinformatics analysis to identify functional enrichment in gene ontology (GO) and analysis of synthetic lethal pairs. Synthetic lethality is the well-studied pairwise genetic mutation combination that occurs between two non-lethal genes' mutation, where their simultaneous loss causes cell death while the individual gene deletion has no impact on cell viability [31]. Synthetic lethality approach has been recently applied in chemotherapy for the treatment of cancer [32,33] but not fully exploited in case of pathogenic infections in the lack of systematical identification of synthetic lethal pairs. Synthetic lethality is well studied in yeast and other organisms (fruit fly, worm, mouse, and human). Of the entire genes identified from whole-genome sequence in *Saccharomyces cerevisiae* (~6000 genes) only ~1000 are essential genes (known by single-gene deletion mutant) whereas the other non-essential genes, when disrupted in any two combinations (synthetic lethality), cause over 170,000 synthetic lethal pairs [34]. These observations provide the essentiality of every gene in an organism [35]. The canonical explanation governing the mechanism of synthetic lethality is based on the deletion of two genes (synthetic lethal partner) in three ways: parallel pathway, same pathway and reversible steps in a pathway. First, the deletion of two genes working on the parallel pathways that perform the same essential function is redundant in nature (mutually compensatory) [31,36]. Second, deletion of two genes working on the same pathway can be explained by three possible phenomena: (1) partial redundancy where the loss of one gene causes only the partial degradation that might be tolerable but not the loss of both genes; (2) internal redundancy of two steps within a pathway; and (3) double defect in essential protein complex formed by many proteins [34,37,38]. 
Third, deletion of two genes that target the reversible forward and backward steps of the non-essential pathway [38]. Furthermore, the synthetic lethality approach was applied to determine the effect of pairwise protein essentiality among the identified protein targets of Lfcin B and Histatin-5.

In our previous studies, we systematically explored the *E. coli* protein targets of Lfcin B by using *E. coli* proteome microarrays [39,40]. To analyze the similarities and differences in the target patterns of Lfcin B in different species, i.e., yeast and *E. coli*, we compared the enrichment results obtained from the yeast protein targets of Lfcin B with those obtained from its *E. coli* protein targets.

#### **2. Results**

#### *2.1. Yeast Proteome Microarrays Assay*

To systematically identify the protein targets of Lfcin B and Histatin-5, high-throughput yeast proteome microarrays were employed. Yeast proteome microarray analysis allows the parallel identification of protein targets of Lfcin B and Histatin-5 from the entire yeast proteome. The overall schematic diagram of this study is depicted in Figure 1. Biotinylated Lfcin B and Histatin-5 were individually probed on yeast proteome microarrays, which were then further probed with DyLight 650-labeled streptavidin and DyLight 550-labeled anti-Glutathione-S-transferase (anti-GST) antibody. DyLight 650-labeled streptavidin was used to detect the biotinylated AMPs that bound to the proteins on the yeast proteome microarrays. Because the individually purified yeast proteins contain GST tags, anti-GST antibody labeled with DyLight 550 was used to represent their relative protein amounts on the yeast proteome microarray. Each protein on the yeast proteome microarray was printed in duplicate, and the yeast proteome microarray assays were conducted in triplicate for Lfcin B and Histatin-5 individually.

**Figure 1.** The schematic diagram of this study. Yeast proteome microarrays were fabricated from the ~5800 yeast proteins, which were individually expressed and purified from yeast. To identify the protein targets, the fabricated yeast proteome microarrays were probed with biotinylated Lfcin B and Histatin-5 individually. The microarrays were then probed with DyLight 650-labeled streptavidin and DyLight 550-labeled anti-GST antibody to detect the signal of biotinylated AMPs bound to their specific binding partners and the relative amount of protein on the proteome microarray, respectively. The identified hits of Lfcin B and Histatin-5 were systematically analyzed for GO and synthetic lethal pairs by using several online bioinformatics databases (DAVID tools and Synthetic Lethality).

The data obtained from the triplicate yeast proteome microarrays were analyzed together to identify the protein targets of Lfcin B and Histatin-5. After median scaling normalization, hits of Lfcin B and Histatin-5 were selected using two criteria: a signal greater than the local cutoff value (mean plus two standard deviations) and a ratio of AMP signal to anti-GST antibody signal higher than 0.5. The total numbers of protein targets identified for Lfcin B and Histatin-5 from yeast proteome microarrays were 140 and 137, respectively. These hits were all validated by visual inspection of their images. Representative images of the yeast proteome microarray assays of Lfcin B are depicted in Figure 2A, and enlarged images of representative protein targets of Lfcin B that appeared in the triplicate assays are depicted in Figure 2B. Similarly, representative images of the yeast proteome microarray assays of Histatin-5 are depicted in Figure 2C, and enlarged images of representative protein targets that appeared in the triplicate assays of Histatin-5 are depicted in Figure 2D. The identified protein targets of Lfcin B and Histatin-5 were analyzed for common and unique targets. Common hits are the protein targets shared by Lfcin B and Histatin-5, whereas unique hits are the protein targets bound by only one of the two peptides. The Venn diagram (Figure 3) shows that Lfcin B and Histatin-5 have 77 protein targets in common. The unique protein targets of Lfcin B and Histatin-5 number 63 and 60, respectively. The complete lists of unique and common protein targets of Lfcin B and Histatin-5 from yeast proteome microarrays are given in Supplementary Table S1.
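The two hit-selection criteria described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `call_hits`, the dictionary inputs, and the toy intensities are all hypothetical, and the "local" cutoff is assumed here to be computed over whichever set of spots is passed in.

```python
from statistics import mean, stdev

def call_hits(amp_signal, gst_signal, n_sd=2.0, min_ratio=0.5):
    """Return the sorted names of spots that pass both hit criteria.

    amp_signal / gst_signal map a protein name to its normalized
    intensity in the streptavidin (AMP) and anti-GST channels.
    Assumption: the cutoff is taken over all spots supplied.
    """
    values = list(amp_signal.values())
    cutoff = mean(values) + n_sd * stdev(values)  # mean plus two SD
    hits = []
    for name, amp in amp_signal.items():
        gst = gst_signal[name]
        ratio = amp / gst if gst > 0 else 0.0  # AMP-to-GST signal ratio
        if amp > cutoff and ratio > min_ratio:
            hits.append(name)
    return sorted(hits)

# Toy data (hypothetical): 20 background spots and two bright spots.
amp = {f"BG{i}": 100.0 for i in range(20)}
gst = {f"BG{i}": 1000.0 for i in range(20)}
amp["HIT_A"], gst["HIT_A"] = 2000.0, 1000.0    # bright, ratio 2.0
amp["HIT_B"], gst["HIT_B"] = 2000.0, 10000.0   # bright, ratio 0.2

hits = call_hits(amp, gst)
```

In this toy set both bright spots exceed the intensity cutoff, but only `HIT_A` also satisfies the ratio criterion, which shows why the anti-GST channel is needed to discount spots that are bright merely because much protein was printed.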

**Figure 2.** Representative yeast proteome microarray images and representative hits of Lfcin B and Histatin-5 on yeast proteome microarrays. Images show the microarrays and the enlarged hit images of Lfcin B and Histatin-5. Red and green colors denote the signals from DyLight 650-labeled streptavidin and DyLight 550-labeled anti-GST antibody, respectively. Red spots represent the signal of Lfcin B and Histatin-5 on yeast proteome microarrays. (**A**) Triplicate microarray images probed with biotinylated Lfcin B. The positions of the five representative hits are shown on the microarray. (**B**) Representative enlarged hit images of Lfcin B. (**C**) Triplicate microarray images probed with biotinylated Histatin-5. (**D**) Representative enlarged hit images of Histatin-5.

**Figure 3.** Unique and common hits of Lfcin B and Histatin-5 identified from yeast proteome microarrays. The protein targets of Lfcin B and Histatin-5 identified from yeast proteome microarrays are 140 and 137, respectively. These identified hits of Lfcin B and Histatin-5 were categorized as unique and common hits. Unique protein targets that are present only in Lfcin B are 63 hits whereas the unique protein targets that are present only in Histatin-5 are 60 hits. 77 protein targets were common to both Lfcin B and Histatin-5.
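The partition into common and unique hits is plain set arithmetic, and the reported counts can be checked for internal consistency (77 + 63 = 140 and 77 + 60 = 137). The sketch below uses placeholder identifiers sized to match the reported counts; the real protein names are in Supplementary Table S1, and `partition_hits` is a hypothetical helper name.

```python
def partition_hits(hits_a, hits_b):
    """Split two hit lists into common and peptide-specific targets."""
    a, b = set(hits_a), set(hits_b)
    return {"common": a & b, "unique_a": a - b, "unique_b": b - a}

# Placeholder identifiers (not real protein names), sized to the
# counts reported for Lfcin B and Histatin-5.
shared = {f"C{i}" for i in range(77)}
lfcin_b_hits = shared | {f"L{i}" for i in range(63)}
histatin_5_hits = shared | {f"H{i}" for i in range(60)}

parts = partition_hits(lfcin_b_hits, histatin_5_hits)
counts = {name: len(members) for name, members in parts.items()}
```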

#### *2.2. Enrichment Analysis in GO Biological Process for the Protein Hits of Lfcin B and Histatin-5*

To identify over-represented functional categories among the protein targets of Lfcin B and Histatin-5, GO enrichment analysis was performed. Using the Database for Annotation, Visualization and Integrated Discovery (DAVID) online database [41], we obtained GO enrichment results in biological processes for the protein hits of Lfcin B and Histatin-5. The results display significant over-representation for the protein targets of Lfcin B and Histatin-5 (*p*-value cutoff of 0.05) in several biological processes (Figure 4). Interestingly, Lfcin B showed enrichment in most of the displayed categories, indicating that Lfcin B acts on a broader range of targets than Histatin-5. This might be a reason that Lfcin B has a lower MIC than Histatin-5 does. Lfcin B showed the most pronounced enrichment in "macromolecular complex subunit organization", which covers the aggregation or disaggregation of macromolecules to form or alter protein complexes. Lfcin B also showed enrichment in several categories that were not enriched for Histatin-5. These include several negative regulation processes that stop or reduce cellular processes, such as "negative regulation of cellular process", "negative regulation of cellular metabolic process", "negative regulation of metabolic process", and "negative regulation of macromolecule metabolic process", as well as regulatory processes that modulate the chemical reactions and pathways involved in normal metabolism, such as "regulation of primary metabolic process", "regulation of cellular metabolic process", "regulation of macromolecule metabolic process", "regulation of catalytic activity", and "regulation of cellular component size" (Figure 4). These results indicate that Lfcin B targets proteins that manipulate the transcriptional response. Moreover, Lfcin B showed unique enrichment in the organization of cytoskeletal structures comprised of actin filaments, such as "actin filament organization", "regulation of actin filament-based process", and "actin filament-based process", as well as "cellular component assembly". This over-representation suggests a possible mechanism by which Lfcin B enters yeast cells and hampers their cytoskeleton. On the other hand, unique over-representation for the protein targets of Histatin-5 was observed in only two biological processes, i.e., "cellular catabolic process" and "translational initiation". These two categories are related to the breakdown of substances and the formation of the initial translational complex of the ribosome and mRNA, respectively.
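To make the notion of over-representation concrete, the sketch below computes a one-sided hypergeometric tail probability for a single GO category. It is a generic stand-in rather than the DAVID calculation itself (DAVID reports its own modified Fisher exact statistic), and the category size and overlap used in the example are hypothetical; only the ~5800-protein background and the 140 Lfcin B hits come from the text.

```python
from math import comb

def enrichment_p_value(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: proteins in the background (e.g., all proteins on the array)
    K: background proteins annotated to the category
    n: proteins in the hit list
    k: hit-list proteins annotated to the category
    """
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible overlaps contribute nothing to the tail.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical example: a category with 200 annotated proteins, of
# which 15 appear among 140 hits drawn from a 5800-protein background.
p = enrichment_p_value(k=15, n=140, K=200, N=5800)
significant = p < 0.05
```

With these illustrative numbers about five annotated proteins would be expected by chance, so an overlap of 15 falls well below the 0.05 threshold marked by the dotted line in the enrichment figures.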

Common enrichment for the protein targets of Lfcin B and Histatin-5 shows the conserved targets of these two AMPs in yeast. "Ribonucleoprotein complex biogenesis", which involves RNA-protein complex formation, is a well-conserved target category of Lfcin B and Histatin-5. Several compound metabolic processes were also conserved, such as "cellular aromatic compound metabolic process", "organic cyclic compound metabolic process", "nucleobase-containing compound metabolic process", and "heterocycle metabolic process". "Negative regulation of molecular function" and "cellular component disassembly" are also commonly enriched for Lfcin B and Histatin-5 (Figure 4).

**Figure 4.** Enrichment in biological process of Lfcin B and Histatin-5 hits obtained from yeast proteome microarrays. The significant enrichment categories in biological process of Lfcin B and Histatin-5 of yeast-binding proteins (dotted line indicates the *p*-value of 0.05).

#### *2.3. Enrichment Analysis in GO Molecular Function and Cellular Component*

In addition to GO enrichment in the biological process, we also analyzed GO enrichment in molecular function and cellular component for the protein targets of Lfcin B and Histatin-5 (Figure 5). The analysis of over-representation for the protein targets of Lfcin B in GO molecular function (Figure 5A) showed unique enrichment in "histone binding" (related to DNA binding protein), "basal transcription machinery binding" (related to proteins of basal transcription factors and RNA polymerase core enzyme) indicating the involvement of Lfcin B in several proteins related to DNA and RNA binding that regulate gene expression in yeast. Protein targets of Lfcin B also showed unique over-representation in "protein complex binding" and "ubiquitin-like protein transferase activity". On the other hand, Histatin-5 showed no unique enrichment. However, common enrichment for Lfcin B and Histatin-5 were observed in "ribonucleoprotein complex binding" and "enzyme binding". The enrichment in "ribonucleoprotein complex binding" is similar to the enrichment observed in biological process which indicates the highly conserved targets of Lfcin B and Histatin-5. These results also indicate that the target protein functions of Histatin-5 are also the target protein functions of Lfcin B, and yet Lfcin B targets additional protein functions (Figure 5A).

Figure 5B depicts the GO enrichment in cellular component for the protein targets of Lfcin B and Histatin-5. Unique enrichment of Lfcin B is observed in "cell projection part", "mating projection", "mating projection tip", "cell cortex", "site of polarized growth", and "eisosome filament"—these are related to cell and mating projection as well as several signaling pathways which manipulate the development and communication. Lfcin B specifically showed unique enrichment in "Dom34-Hbs1 complex" that indicates its effect on cotranslational mRNA quality control. On the other hand, Histatin-5 showed no unique enrichment. These results also depicted the wider range of over-representation of protein targets in Lfcin B than Histatin-5. In common enrichment, the enriched categories belong to "intracellular non-membrane-bounded organelle", "intracellular ribonucleoprotein complex", "organelle lumen", and "intracellular organelle lumen". These results showed conserved enrichment of Lfcin B and Histatin-5 in targeting yeast.

**Figure 5.** Functional enrichment of Lfcin B and Histatin-5 hits obtained from yeast proteome microarrays. The significant enrichment categories of Lfcin B and Histatin-5 of yeast-binding proteins. (**A**) Enrichment in molecular function (**B**) Enrichment in cellular component (dotted line indicates the *p*-value of 0.05).

#### *2.4. Comparison of Lfcin B Protein Targets of Yeast and E. coli*

To analyze the target pattern of Lfcin B in yeast and *E. coli*, we compared the enrichment results of the protein targets of Lfcin B obtained from yeast proteome microarrays (in this study) and *E. coli* proteome microarrays (our previous study) [39]. Figure 6 shows the comparison of GO enrichment in biological process (Figure 6A), molecular function (Figure 6B), and cellular component (Figure 6C) for the protein targets of Lfcin B in yeast and *E. coli*. In biological process, the protein targets of Lfcin B from yeast and *E. coli* showed only 7 common enrichment categories but 15 and 14 unique enrichment categories, respectively (Figure 6A). The many unique enrichment results for yeast and *E. coli* indicate that Lfcin B exerts different target patterns in the two organisms. These results also suggest that the mechanism by which Lfcin B inhibits yeast differs from the one by which it inhibits *E. coli*. Some targets of Lfcin B are conserved between yeast and *E. coli* and showed common enrichment, mostly related to metabolic processes, such as "heterocycle metabolic process", "cellular aromatic compound metabolic process", "organic cyclic compound metabolic process", "regulation of primary metabolic process", "nucleobase-containing compound metabolic process", "regulation of cellular metabolic process", and "regulation of macromolecular metabolic process".

**Figure 6.** Functional enrichment of Lfcin B hits obtained from *E. coli* proteome microarray and yeast proteome microarrays. The comparison of enrichment categories of Lfcin B target proteins from yeast and *E. coli* proteomes. (**A**) Enrichment in biological process (**B**) Enrichment in molecular function and (**C**) Enrichment in cellular component (dotted line indicates the *p*-value of 0.05).

Differences in the mechanism of action of Lfcin B in yeast and *E. coli* were also evident when comparing the enrichment results for molecular function and cellular component. Figure 6B shows the enrichment categories in molecular function. For the *E. coli* protein targets, Lfcin B showed a single unique enrichment in "nucleic acid binding", whereas the yeast protein targets of Lfcin B showed significant enrichment in several functions related mostly to protein binding, such as "enzyme binding", "protein complex binding", "ribonucleoprotein complex binding", "basal transcription machinery binding", "histone binding", and "ubiquitin-like protein transferase activity". It is notable that Lfcin B mostly targets yeast proteins with protein-binding functions and *E. coli* proteins with nucleic acid-binding functions. For cellular component (Figure 6C), the *E. coli* protein targets of Lfcin B showed enrichment in the cytoplasm and intracellular categories, whereas the yeast protein targets of Lfcin B were enriched in several categories related to cell, ribonucleoprotein, organelle lumen, and mating projection, as well as several signaling pathways.

The above comparison of the protein targets of Lfcin B from yeast and *E. coli* showed more functional enrichments in yeast than in *E. coli*, indicating that Lfcin B has a wider range of targets in yeast than in *E. coli*. Thus, Lfcin B has a stronger effect against yeast than against *E. coli*.

#### *2.5. Identification of Synthetic Lethal Pairs Targeted by Lfcin B and Histatin-5*

The identification of synthetic lethal pairs is critical for deciphering the mechanism of action, as they exert lethal effects on growth. The Synthetic Lethality database [42] was used to identify synthetic lethal pairs among the protein targets of Lfcin B and Histatin-5. Within the protein targets of Lfcin B, we identified 11 synthetic lethal pairs. All the identified synthetic lethal pairs within the protein targets of Lfcin B are shown in Table 1. Among these 11 synthetic lethal pairs of Lfcin B, one paralog pair (SIS2–VHS3) was also identified. Within the protein targets of Histatin-5, we identified a total of three synthetic lethal pairs. Synthetic lethal pairs cause lethal effects on yeast growth, and the more synthetic lethal pairs are targeted, the greater the expected impact on growth. The 11 synthetic lethal pairs identified within the yeast protein targets of Lfcin B would cause a more intense lethal effect on yeast than the three synthetic lethal pairs identified within the protein targets of Histatin-5. This might also be a potential reason for the previously observed lower MIC of Lfcin B against *Saccharomyces cerevisiae* than that of Histatin-5.


**Table 1.** Synthetic lethal pairs within the total protein targets of Lfcin B and Histatin-5. \* denotes paralog pairs among the identified synthetic lethal pairs.

We further analyzed synthetic lethal pairs between the unique protein targets of Lfcin B and Histatin-5 by using the synthetic lethality database. Two synthetic lethal pairs were identified between the unique protein targets of Lfcin B and Histatin-5. The identified synthetic lethal pairs and the functions of the individual proteins are listed in Table 2. The individual protein images and their signals from the triplicate yeast proteome microarrays were also validated by visual inspection, and the enlarged protein images are shown in Figure 7. The first synthetic lethal pair (SPT8 and HFI1) identified between the unique protein targets of Lfcin B and Histatin-5 consists of subunits of the Spt-Ada-Gcn5 acetyltransferase (SAGA) complex. The SAGA complex is involved in histone modification, with histone acetyltransferase and histone deubiquitinase activities [43]. The synthetic lethality caused by Lfcin B and Histatin-5 treatment will hamper the structural integrity as well as the histone modification function of the SAGA complex. Moreover, the SPT8 and HFI1 subunits of the SAGA complex are reported to bind to TATA-binding protein (TBP) and to function in delivering TBP to the TATA box [44]. Thus, this lethal pair will also affect the initiation of transcription.
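The two lookups described in this section, pairs lying within one target list and pairs bridging the unique targets of the two peptides, can be sketched as set operations over a table of known synthetic lethal pairs. The function names and the tiny pair table are illustrative: the four named pairs are the ones mentioned in the text, `GENE_X`/`GENE_Y` is a made-up filler entry, and the target subsets are a small excerpt rather than the full hit lists.

```python
def pairs_within(targets, sl_pairs):
    """Synthetic lethal pairs whose two members are both in `targets`."""
    t = set(targets)
    return sorted(tuple(sorted(p)) for p in sl_pairs if set(p) <= t)

def pairs_between(unique_a, unique_b, sl_pairs):
    """Synthetic lethal pairs with one member in each unique target set."""
    a, b = set(unique_a), set(unique_b)
    found = []
    for x, y in sl_pairs:
        if (x in a and y in b) or (y in a and x in b):
            found.append(tuple(sorted((x, y))))
    return sorted(found)

# Illustrative excerpt of a synthetic lethality table.
sl_table = [
    ("SIS2", "VHS3"),
    ("SPT8", "RAD6"),
    ("SPT8", "HFI1"),
    ("RAD6", "HFI1"),
    ("GENE_X", "GENE_Y"),  # hypothetical pair, matches no target
]

lfcin_b_subset = {"SIS2", "VHS3", "SPT8", "RAD6"}  # excerpt of Lfcin B targets
lfcin_b_unique = {"SPT8", "RAD6"}                  # excerpt of unique Lfcin B targets
histatin_5_unique = {"HFI1"}                       # excerpt of unique Histatin-5 targets

within_lfcin_b = pairs_within(lfcin_b_subset, sl_table)
between_peptides = pairs_between(lfcin_b_unique, histatin_5_unique, sl_table)
```

Pairs found by `pairs_between` are the ones of interest for combination treatment, because neither peptide alone binds both members.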

**Table 2.** Synthetic lethal pairs between the unique hits of Lfcin B and Histatin-5 along with the functions of individual proteins involved in these synthetic lethal pairs.


**Figure 7.** Synthetic lethality and enlarged protein images of SPT8, RAD6, and HFI1 on the yeast proteome microarray. The enlarged protein images of the two synthetic lethal pairs identified between the unique hits of Lfcin B and Histatin-5, as well as the synthetic lethal pair within the protein targets of Lfcin B.

The second synthetic lethal pair (RAD6 and HFI1) identified between Lfcin B and Histatin-5 also targets the histone modification function. RAD6 (ubiquitin associated enzyme E2), together with BRE1 (ubiquitin enzyme E3), is known to modify histone H2B at lysine 123 with ubiquitin [45]. This monoubiquitination of histone H2B does not mark H2B for degradation; rather, the ubiquitin on H2B (H2B~Ub) serves as a signal for various aspects of gene expression, such as initiation and elongation of transcription as well as DNA replication and repair [46]. Ubiquitination at H2B-lysine 123 is reversed by the SAGA complex, which deubiquitinates H2B~Ub and acetylates histones [47]. H2B~Ub up-regulates histone H3-lysine 4 methylation and down-regulates histone H3-lysine 36 methylation, whereas SAGA deubiquitination of H2B acts in the opposite way, reducing H3-lysine 4 methylation and increasing H3-lysine 36 methylation levels. Together, ubiquitination and deubiquitination are involved in transcriptional activation [48]. Lfcin B targeting RAD6 would hamper histone H2B ubiquitination, whereas Histatin-5 targeting HFI1 would disrupt the SAGA complex and impair its functions (deubiquitination and acetylation) in modifying histone H2B. Moreover, SPT8 and RAD6 are both protein targets of Lfcin B and were also reported to be synthetic lethal (Table 1). The mechanism of this synthetic lethality is similar to that of RAD6 and HFI1 (as explained above).

It is known that synthetic lethal pairs identified during treatment with two drugs may have a synergistic effect [49–51], as the two drugs targeting the individual protein of synthetic lethal pair cause extra synthetic lethality. The two synthetic lethal pairs identified between the unique protein targets of Lfcin B and Histatin-5 ensure additional synthetic lethal effects on yeast growth that are not observed with individual treatment of Lfcin B or Histatin-5. Moreover, both these identified synthetic lethal pairs are involved in the structure and function of SAGA protein complex that regulates gene expression by modifying histone. Additional synthetic lethal pairs, as well as the same protein complex target, will cause a significantly higher lethality effect on yeast by the combined treatment of Lfcin B and Histatin-5. Thus, we hypothesized synergistic combination between Lfcin B and Histatin-5 on yeast.

#### *2.6. Validation of Synergistic Combination between Lfcin B and Histatin-5*

Based on the two synthetic lethal pairs targeted by unique hits of Lfcin B and Histatin-5, we predicted a synergistic effect between Lfcin B and Histatin-5. This prediction was experimentally validated by growth inhibition curves in the presence of the individual AMPs and of their combination (Figure 8). The combination of Lfcin B and Histatin-5 clearly showed a drastic inhibitory effect on yeast growth compared with the individual AMPs. To show that the result is synergistic, we calculated the expected combination value of Lfcin B and Histatin-5 from the optical density (OD) values obtained with the individual peptides, applying their inhibitory effects in sequence (i.e., the OD of yeast without AMP multiplied by the fraction of yeast remaining after treatment with Lfcin B and further multiplied by the fraction of yeast remaining after treatment with Histatin-5). A significant difference was observed between the expected combination value and the experimental combination value of Lfcin B and Histatin-5. This result demonstrated the synergistic combination of Lfcin B and Histatin-5 against yeast growth.
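The expected combination value described above is a product of the control OD and the two surviving fractions, which resembles what is commonly called a Bliss independence expectation (the text does not name the model). A minimal sketch follows; the OD readings are hypothetical, the function names are illustrative, and the statistical test used to judge significance is not reproduced.

```python
def expected_combination_od(od_control, od_a, od_b):
    """Expected OD under independent action of two peptides.

    od_control: OD of yeast grown without AMP
    od_a, od_b: OD after treatment with each peptide alone
    """
    frac_a = od_a / od_control  # fraction remaining after peptide A
    frac_b = od_b / od_control  # fraction remaining after peptide B
    return od_control * frac_a * frac_b

def is_synergistic(od_control, od_a, od_b, od_combo):
    """True when the observed combination OD falls below the expectation."""
    return od_combo < expected_combination_od(od_control, od_a, od_b)

# Hypothetical end-point readings, not data from the study.
expected = expected_combination_od(od_control=1.20, od_a=0.90, od_b=1.00)
synergy = is_synergistic(od_control=1.20, od_a=0.90, od_b=1.00, od_combo=0.40)
```

Here the surviving fractions are 0.75 and about 0.83, giving an expected OD of 0.75; an observed OD of 0.40 is lower, so the combination would be classed as synergistic under this criterion.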

**Figure 8.** Growth inhibition effect of Lfcin B and Histatin-5, individually and in combination, on yeast. Yeast was grown without AMPs and in the presence of Lfcin B (15 μg/mL) and Histatin-5 (20 μg/mL), individually and in combination. Significant inhibition of growth was observed with the combination of Lfcin B and Histatin-5. The growth observed with the combination of Lfcin B and Histatin-5 was significantly lower than the expected combination value calculated from the individual inhibition by Lfcin B and Histatin-5, indicating a synergistic combination between Lfcin B and Histatin-5.

#### **3. Discussion**

Lfcin B and Histatin-5 are potent AMPs with antifungal activities [16,52]. Because their complete sets of targets are unknown, the potential mechanisms of antifungal AMPs have not been fully explored. In this study, we systematically identified the yeast protein targets of Lfcin B and Histatin-5 by using yeast proteome microarrays. A total of 140 and 137 protein targets of Lfcin B and Histatin-5, respectively, were identified from yeast proteome microarrays. Lfcin B and Histatin-5 can penetrate the yeast cell envelope, so we assume that the bioavailability of Lfcin B and Histatin-5 on the yeast proteome microarrays is similar to that in yeast in vivo.

Earlier studies with *Candida albicans* have shown the binding of Histatin-5 to the cell surface proteins SSA1 and SSA2 [53,54]. In our yeast proteome microarray, both SSA1 and SSA2 were absent, thus we did not identify them in our hit list. Histatin-5 is reported to target mitochondria [26,55] and generate and release reactive oxygen species (ROS) that causes cell death [52]. Thus, we looked for the protein targets of Histatin-5 that are involved in mitochondria. We identified seven proteins: AIM21, AIM26, DOA1, YMR26, ETR1, ATP5, and RML2, which are involved in several mitochondrial functions, and six of them (not AIM21) are localized inside the mitochondria. AIM21 is a cytoplasmic protein that helps mitochondrial migration along actin filaments. This result depicted that our finding not only supported the previous finding but also provided the complete targets of Histatin-5 in mitochondria.

Ergosterol is the major fungal membrane sterol essential for fungal cell viability and is absent in humans [56]. Most of the antifungal drugs on the market target ergosterol biosynthesis enzymes [57]. Among the hits of Lfcin B, Lanosterol synthase (ERG7) belonging to ergosterol biosynthesis was identified. Molsidomine, a drug used as a vasodilator for the treatment of angina, was reported to target ERG7 and showed potential antifungal activity [58]. However, no antifungal drug targeting ERG7 is commercially available on the market.

Furthermore, lowering the hit identification cutoff value from 2 SD to 1 SD identified two protein targets belonging to the ergosterol pathway for each of Lfcin B and Histatin-5. Lfcin B targeted ERG1 and ERG7, whereas Histatin-5 targeted ERG1 and ERG12. These protein targets were individually validated by visual inspection, and the enlarged images of these proteins are shown in Supplementary Figure S1. ERG1, the common target of Lfcin B and Histatin-5, is the first enzyme in ergosterol biosynthesis and a well-known target of the allylamine class of antifungal drugs (naftifine and terbinafine) [59]. ERG7 is the second enzyme in ergosterol biosynthesis, after ERG1. ERG1, ERG7, and ERG12 are essential genes in the biosynthesis of ergosterol, and their individual deletion is lethal for yeast. Deletion of ERG7 results in accumulation of ERG9, which hampers squalene production and, in turn, ergosterol biosynthesis. STRING analysis showed protein–protein interactions between ERG1, ERG7, and ERG12 (data not shown). ERG7 and ERG12 have not been exploited as drug targets; thus, Lfcin B and Histatin-5 are potential candidates for coping with emerging antifungal drug-resistant fungi.

The synthetic lethality approach has recently gained attention for its potential in understanding and designing medication for cancer [32,33]. Synthetic lethality describes a useful pairwise interaction in which the simultaneous deletion of both components of the pair causes growth defects, but the deletion of an individual component has no effect on growth [31]. We used a synthetic lethality approach to identify synthetic lethal pairs within the protein targets of Lfcin B and Histatin-5. The number of synthetic lethal pairs identified within the protein targets of Lfcin B and Histatin-5 might reflect the potential effect each peptide exerts on yeast. Within the protein targets, 11 synthetic lethal pairs for Lfcin B and three synthetic lethal pairs for Histatin-5 were identified. These results suggest that Lfcin B exerts a more lethal effect than Histatin-5. Interestingly, the MIC of Lfcin B is reported to be lower than that of Histatin-5. Thus, our analysis provides a possible mechanism for the lower MIC of Lfcin B against yeast compared with Histatin-5, based on the number of identified synthetic lethal pairs. Our analysis also identified two synthetic lethal pairs between the unique protein targets of Lfcin B and Histatin-5. Apart from the individual synthetic lethal pairs of Lfcin B and Histatin-5, the combination of Lfcin B and Histatin-5 targeted two additional synthetic lethal pairs. These additional synthetic lethal pairs are subunits of the SAGA protein complex and are also involved in similar functions. The treatment of yeast with Lfcin B and Histatin-5 might cause a greater inhibitory effect, as together they would shut down the structure and function of the SAGA complex. Together, Lfcin B and Histatin-5 might therefore act synergistically. We performed in vivo growth inhibition assays to test our hypothesis of a synergistic combination between Lfcin B and Histatin-5. The significantly higher inhibition in the combined treatment with Lfcin B and Histatin-5 (experimental data) than the expected combination value confirmed the synergistic combination between Lfcin B and Histatin-5.

In this study, we explored the full set of biological targets of Lfcin B and Histatin-5 using yeast proteome microarrays. The identified protein hits were analyzed for over-representation in different GO enrichment categories. Lfcin B showed broader GO enrichment than Histatin-5 in all three categories: biological process, molecular function, and cellular component. Moreover, 11 synthetic lethal pairs were identified within the protein targets of Lfcin B, whereas only three were identified within the protein targets of Histatin-5. Both results indicate a higher lethal effect of Lfcin B on yeast than of Histatin-5. This might result in the lower MIC of Lfcin B compared with Histatin-5, which is in accordance with previously reported results. Two synthetic lethal pairs were identified between the unique protein targets of Lfcin B and Histatin-5. Thus, we hypothesized a synergistic combination of Lfcin B and Histatin-5, designed a growth inhibition assay to test it, and successfully validated the hypothesis. In the future, we will further explore the mechanisms of action of other AMPs with antifungal activities.

#### **4. Materials and Methods**

#### *4.1. Expression and Purification of the Entire Yeast Proteome*

The entire yeast proteome was expressed and purified using a previously reported protocol [19,20]. Briefly, the yeast library consists of ~5800 open reading frames, individually cloned in a high-copy URA3 expression vector with a glutathione-S-transferase–polyhistidine (GST–HisX6) tag. These clones use the galactose-inducible GAL1 promoter to produce GST–HisX6 fusion proteins. Yeast clones were stored in 96-well format at −80 °C. For protein expression, yeast clones were first grown on SC-URA3-glucose agar plates at 30 °C for 48 h. Colonies were transferred to SC-URA3-glucose medium in 96-deep-well plates and incubated at 30 °C for 24 h. Yeast clones were sub-cultured in SC-URA3-raffinose medium in 12-channel reservoirs (a single 96-well plate requires eight 12-channel reservoirs; in total, the yeast library comprises 66 96-well plates) and incubated at 30 °C for 14–16 h (until the optical density at 600 nm reached 0.6–1.0). To induce protein expression, 2% galactose was added to the cultures in SC-URA3-raffinose medium, which were incubated for a further 4 h. Cells in the 12-channel reservoirs were centrifuged at 4000 rpm for 2 min and re-suspended in 800 μL of cold water. Each strain in the 12-channel reservoirs was pooled into a single well of a 96-deep-well plate. Cells were harvested by centrifugation at 4000 rpm for 2 min and stored at −80 °C until protein purification.

For protein purification, yeast cells were ruptured and GST-tagged proteins were purified using glutathione (GSH) beads, following the standard protocol. Briefly, 200 μL of 0.7 mm zirconia beads (Biospec Products, Inc., Bartlesville, OK, USA) were loaded into each well of the 96-deep-well plates containing cell pellets. Freshly prepared lysis buffer (400 μL) with protease inhibitors (1 mM phenylmethylsulfonyl fluoride (PMSF), 50 μM calpain inhibitor I (LLnL), 1 μM MG132 and a 50× dilution of Roche protease inhibitor (Roche Molecular Biochemicals, Basel, Switzerland)) was added to each well, and the cells were thawed at 4 °C for 15 min. Unless stated otherwise, the chemicals used here were purchased from Sigma-Aldrich (Saint Louis, MO, USA). A further 400 μL of freshly prepared lysis buffer with protease inhibitors was then added, and the 96-deep-well plates were vortexed at 4 °C for 30 min. After centrifugation at 4000 rpm for 15 min, the supernatant was transferred to new 96-deep-well plates. Pre-washed GSH Sepharose 4B beads (100 μL; GE Healthcare, Chicago, IL, USA) were added to each well, and the plates were sealed tightly using 96-well cap mats (Thermo Fisher Scientific, Waltham, MA, USA). The plates were placed vertically on a shaker and shaken gently (80 rpm) at 4 °C for 80 min. The homogeneous mixtures in the 96-deep-well plates were transferred to 96-well filter plates (Thermo Fisher Scientific, Waltham, MA, USA) with a filter pore size of 20 μm. The contents were washed with wash buffers I and II and gently spun dry (1000 rpm for 1 min) to remove excess wash buffer. The bottoms of the filter plates were sealed, and 50 μL of elution buffer containing reduced GSH was added to each well. The 96-well filter plates were shaken vigorously at 4 °C for 1 h. The eluates (proteins) were collected in 96-well receiver plates by centrifugation at 4000 rpm for 2 min. To assess the purity and concentration of the purified proteins, randomly selected proteins were analyzed by SDS-PAGE followed by Coomassie blue staining.

#### *4.2. Fabrication of Yeast Proteome Microarrays*

A previously reported protocol was applied to fabricate the yeast proteome microarrays [20]. Briefly, all purified yeast proteins in 96-well format were transferred to 384-well format using a Liquidator 96 manual pipetting system (Mettler-Toledo Rainin, LLC, Oakland, CA, USA). Before printing, the optimal concentrations of the landmarks used to align blocks on the microarray were determined and tested. In a cold room, the individual proteins and landmarks were printed in duplicate on aldehyde-coated glass slides using a CapitalBio SmartArrayer™ 136 (CapitalBio Corporation, Beijing, China). The CapitalBio SmartArrayer is a high-throughput microarray spotter with 48 pins that prints 48 proteins at the same time. After printing, the chips were left in the cold room overnight and finally stored at −80 °C. To monitor the shape, size, and uniformity of each protein spot on a chip, the chip was probed with DyLight 550-conjugated anti-GST monoclonal antibody (Rockland Immunochemicals, Gilbertsville, PA, USA), washed, and scanned with a LuxScan™ 10K Microarray Scanner (CapitalBio Corporation, Beijing, China).

#### *4.3. Yeast Proteome Chip Assays with Lfcin B and Histatin-5*

N-terminal biotin-labeled (biotinylated) AMPs (Lfcin B and Histatin-5) were purchased from Kelowna International Scientific Inc. (Taipei, Taiwan). The AMPs were aliquoted and stored at −80 °C. The amino acid sequences of Lfcin B and Histatin-5 used in this study are given below.

Lfcin B (25 residues): H2N–FKCRRWQWRMKKLGAPSITCVRRAF–COOH

Histatin-5 (24 residues): H2N–DSHAKRHHGYKRKFHEKHHSHRGY–COOH

The yeast proteome microarray was first blocked with 3% bovine serum albumin (BSA; Sigma-Aldrich, Saint Louis, MO, USA) in 1X PBS for 1 h at room temperature (RT) with shaking (50 rpm). Chips were washed once with 300 mL PBS-T (0.05% Tween 20) at RT with shaking (50 rpm) for 5 min. Each biotinylated AMP (5 μM) was diluted in 1% BSA in 1X PBS and probed individually on a yeast proteome microarray for 1 h at RT with shaking (50 rpm). Chips were washed once with 300 mL PBS-T at RT with shaking (50 rpm) for 5 min. To detect the signal from the GST tag of the yeast proteins and from biotinylated AMP bound to yeast proteins, DyLight™ 550-labeled anti-GST antibody (Abcam, Cambridge, UK) and DyLight™ 650-labeled streptavidin (Thermo Fisher Scientific, Waltham, MA, USA) were probed on the yeast proteome microarray and incubated with shaking (50 rpm) at RT for 1 h. Finally, the chips were washed three times with 300 mL TBS-T at RT with shaking (50 rpm) for 5 min each. The chips were dried by centrifugation at 1000 rpm for 1 min and scanned with LuxScan.

The chip scan files (TIF format) were opened with GenePix Pro 6.0 software (Axon Instruments, Foster City, CA, USA), and each protein spot on the yeast proteome microarray was aligned with its protein name. Binding signals of the protein spots were exported as GPR files and later opened in Excel for analysis. Median scaling normalization was applied to the signals of Lfcin B, Histatin-5, and anti-GST antibody. After normalization, the relative binding ability of each AMP to each protein was estimated as the ratio of the fluorescence intensity of the AMP to that of the anti-GST antibody. Hits were defined as positive only if they met the following two cutoffs. First, the signal had to be higher than the local cutoff, defined as one standard deviation (SD) above the signal mean for each spot. Second, the fold change of the AMP signal relative to the anti-GST antibody signal had to be higher than 0.5. The protein hit list was then generated according to the relative binding ability.
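The hit-calling logic described above can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this study (which relied on GenePix Pro and Excel). The function names, the toy intensities, and the choice to compute the mean + 1 SD cutoff over all spots of the array (rather than over a local neighborhood) are assumptions made for illustration only.

```python
from statistics import mean, median, stdev

def median_scale(signals):
    """Median scaling: divide a channel by its median so the median becomes 1."""
    m = median(signals)
    return [s / m for s in signals]

def call_hits(names, amp, gst, sd_factor=1.0, min_ratio=0.5):
    """Return (name, AMP/anti-GST ratio) for spots passing both cutoffs.

    Assumption: the signal cutoff is mean + sd_factor * SD of the normalized
    AMP channel over the whole array (a simplification of a local cutoff).
    """
    amp_n = median_scale(amp)
    gst_n = median_scale(gst)
    cutoff = mean(amp_n) + sd_factor * stdev(amp_n)
    hits = []
    for name, a, g in zip(names, amp_n, gst_n):
        ratio = a / g  # relative binding ability
        if a > cutoff and ratio > min_ratio:
            hits.append((name, ratio))
    # Rank hits by relative binding ability, strongest first
    return sorted(hits, key=lambda item: item[1], reverse=True)

# Toy data (hypothetical spot names and intensities)
names = ["P1", "P2", "P3", "P4", "P5", "P6", "P7"]
amp = [100, 120, 90, 110, 1000, 105, 900]     # streptavidin (AMP) channel
gst = [200, 210, 190, 220, 400, 205, 50000]   # anti-GST channel

for name, ratio in call_hits(names, amp, gst):
    print(name, round(ratio, 2))
```

In this toy array, spot P7 has a strong AMP signal but an even stronger anti-GST signal, so it passes the first cutoff and fails the ratio cutoff; this is the situation the second criterion is meant to filter out.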

#### *4.4. Bioinformatics Analysis of Gene Ontology*

The *Saccharomyces* Genome Database (SGD) (https://www.yeastgenome.org/) and the Universal Protein Resource (UniProt) database (https://www.uniprot.org/) provide comprehensive biological information on particular species [60,61]. SGD provides genome-wide information only on *Saccharomyces cerevisiae*, whereas UniProt covers a large group of organisms from prokaryotes to eukaryotes, including *Saccharomyces cerevisiae*. These databases were used to obtain the recommended name (standard name) from the ordered locus name (systematic name), as well as a brief description of each protein and its function and location inside the cell.

DAVID (https://david.ncifcrf.gov/) [41] is an online database that provides several tools, such as GO analysis, Pfam domain analysis, and other enrichment analyses. DAVID was used for a comprehensive analysis of the functional annotations of the positive hits of Lfcin B and Histatin-5, individually. This investigation was necessary to understand the biological meaning of the identified yeast protein targets of Lfcin B and Histatin-5. Here, we used GO terms to identify biological processes, cellular components, and molecular functions, and the statistical significance of the enrichment in each category was determined via the *p*-value.
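To make the notion of a GO enrichment *p*-value concrete, the following sketch computes a plain one-sided hypergeometric test in Python. This is an illustrative stand-in only: DAVID computes its own statistic, which may differ from the value below, and the function name and the example counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: proteins on the array (background)
    K: background proteins annotated with the GO term
    n: proteins in the hit list
    k: hits annotated with the GO term
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical counts: 12 of 100 hits carry a term annotated on 150 of 5800 proteins
p = hypergeom_enrichment_p(k=12, n=100, K=150, N=5800)
print(f"enrichment p-value: {p:.3g}")
```

A small *p*-value indicates that the term occurs among the hits more often than expected from its frequency in the background; in practice a multiple-testing correction would also be needed, since many GO terms are tested at once.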

#### *4.5. Bioinformatics Analysis of Synthetic Lethality Pairs*

The Synthetic Lethality database (SynLethDB; http://histone.sce.ntu.edu.sg/SynLethDB/index.php) is an open-source database of synthetic lethality gene pairs from different source organisms [42]. SynLethDB was used for the comprehensive identification of synthetic lethality pairs among the protein hits of Lfcin B and Histatin-5.
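The pair lookup described above amounts to simple set operations, sketched below in Python. The sketch assumes that the synthetic lethal pairs have already been exported as a list of gene-name pairs; the gene names, the function names, and the data layout are placeholders and do not reproduce the SynLethDB format or the hits of this study.

```python
def normalize(pairs):
    """Store each unordered pair as a sorted tuple so (A, B) equals (B, A)."""
    return {tuple(sorted(p)) for p in pairs}

def pairs_within(hits, sl_pairs):
    """Synthetic lethal pairs whose two members are both in one hit list."""
    hits = set(hits)
    return sorted(p for p in normalize(sl_pairs)
                  if p[0] in hits and p[1] in hits)

def pairs_between(hits_a, hits_b, sl_pairs):
    """Pairs with one member unique to list A and the other unique to list B."""
    only_a = set(hits_a) - set(hits_b)
    only_b = set(hits_b) - set(hits_a)
    return sorted(p for p in normalize(sl_pairs)
                  if (p[0] in only_a and p[1] in only_b)
                  or (p[0] in only_b and p[1] in only_a))

# Placeholder data (hypothetical gene names)
sl_pairs = [("G1", "G2"), ("G3", "G2"), ("G4", "G5"), ("G6", "G1")]
hits_x = {"G1", "G2", "G3", "G4"}   # targets of peptide X
hits_y = {"G2", "G5", "G6"}         # targets of peptide Y

print(pairs_within(hits_x, sl_pairs))
print(pairs_within(hits_y, sl_pairs))
print(pairs_between(hits_x, hits_y, sl_pairs))
```

Pairs returned by `pairs_between` are the ones that only a combination of the two peptides would cover, which is the reasoning behind the synergy hypothesis tested in the growth inhibition assay.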

#### *4.6. Growth Inhibition Assay on Yeast*

Yeast (*Saccharomyces cerevisiae* Y258) was incubated with or without AMPs to observe the individual and combined inhibitory effects of the AMPs. Yeast was grown on a yeast extract peptone dextrose (YPD) agar plate for 48 h at 30 °C. A single colony was transferred into YPD liquid medium and incubated with shaking for 24 h at 30 °C. The optical density at 600 nm (OD600 nm) of the 24 h culture was measured, and the culture was diluted to an OD600 nm of approximately 0.001 in YPD medium. The diluted yeast culture was added to a Nunc™ F 96-well plate (Nalge Nunc International, Rochester, NY, USA) containing the individual AMPs or their combination in specific wells. The working concentrations of Lfcin B and Histatin-5 were 15 μg/mL and 20 μg/mL, respectively. The 96-well plate was incubated at 30 °C in an automated Synergy 2 Multi-Mode Microplate Reader (BioTek Instruments Inc., Winooski, VT, USA), and growth was monitored at OD600 nm at regular intervals of 20 min (with shaking for 15 s prior to each reading). Data were collected automatically using Gen5™ reader control and data analysis software (BioTek Instruments Inc., Winooski, VT, USA). For graphical representation of the growth inhibition assay, the data were plotted using SigmaPlot. The expected value of the combination at a given interval was calculated as EXY = OD0 × (ODX/OD0) × (ODY/OD0) [62], where OD0 is the optical density of yeast in the absence of AMP, ODX is the optical density of yeast in the presence of Lfcin B, and ODY is the optical density of yeast in the presence of Histatin-5 at a given interval of time.
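The expected-value formula can be evaluated directly, as in the following Python sketch. The OD readings are invented for illustration, the function names are hypothetical, and the comparison shown here ignores replicate variability and statistical testing, which a real analysis would require.

```python
def expected_combined_od(od0, od_x, od_y):
    """Expected OD under independent action: OD0 * (ODX/OD0) * (ODY/OD0)."""
    return od0 * (od_x / od0) * (od_y / od0)

def is_more_inhibited_than_expected(od0, od_x, od_y, od_xy):
    """True if the observed combined OD is below the expected value."""
    return od_xy < expected_combined_od(od0, od_x, od_y)

# Invented readings at one time point
od0 = 1.00    # no AMP
od_x = 0.80   # Lfcin B alone
od_y = 0.50   # Histatin-5 alone
od_xy = 0.25  # both AMPs

expected = expected_combined_od(od0, od_x, od_y)
print(f"expected OD = {expected:.2f}, observed OD = {od_xy:.2f}")
print("synergy suggested:", is_more_inhibited_than_expected(od0, od_x, od_y, od_xy))
```

An observed OD below the expected value means that the combination inhibits growth more than the product of the two individual effects would predict.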

**Supplementary Materials:** The following are available online at http://www.mdpi.com/1422-0067/20/17/4218/s1, Figure S1: Image of hits belonging to ergosterol biosynthesis; Lfcin B (ERG7 & ERG1) and Histatin-5 (ERG12 & ERG1). Table S1: Two standard deviation (2SD) hits of Lfcin B and Histatin-5 from yeast proteome microarrays.

**Author Contributions:** C.-S.C. developed the idea and supervised the project. P.S. conducted the experiments. P.S. and W.-S.W. analyzed the data. P.S. and C.-S.C. wrote the manuscript.

**Funding:** This research was financially supported by Ministry of Science and Technology Taiwan, MOST 104-2320-B-008-002-MY3.

**Acknowledgments:** Authors thank Jagat Rathod for his valuable contribution in editing this manuscript.

**Conflicts of Interest:** The authors declare no competing interests.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## **Prediction of Bioactive Peptides from** *Chlorella sorokiniana* **Proteins Using Proteomic Techniques in Combination with Bioinformatics Analyses**

**Lhumen A. Tejano <sup>1</sup>, Jose P. Peralta <sup>1</sup>, Encarnacion Emilia S. Yap <sup>1</sup>, Fenny Crista A. Panjaitan <sup>2</sup> and Yu-Wei Chang <sup>2,</sup>\***


Received: 20 February 2019; Accepted: 9 April 2019; Published: 11 April 2019

**Abstract:** *Chlorella* is one of the most nutritionally important microalgae, with a high protein content, and can be a good source of potential bioactive peptides. In the current study, proteins isolated from *Chlorella sorokiniana* were subjected to in silico analysis to predict potential peptides with biological activities. Molecular characteristics of the proteins were analyzed by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) and proteomics techniques. A total of eight proteins were identified by proteomics techniques from 10 protein bands of the SDS-PAGE. The predictions of BIOPEP's profile of bioactive peptides tool suggested that the proteins of *C. sorokiniana* contain the highest number of dipeptidyl peptidase-IV (DPP IV) inhibitors, with a high occurrence of other bioactive peptides such as angiotensin-I converting enzyme (ACE) inhibitor, glucose uptake stimulant, antioxidant, regulating, anti-amnestic and antithrombotic peptides. In silico analysis of enzymatic hydrolysis revealed that pepsin (pH > 2), bromelain and papain were the proteases that can release relatively larger quantities of bioactive peptides. In addition, combinations of different enzymes in hydrolysis were observed to release higher numbers of bioactive peptides from the proteins than individual proteases. The results suggest that proteins isolated from *C. sorokiniana* could be a source of high-value products with pharmaceutical and nutraceutical application potential.

**Keywords:** *Chlorella sorokiniana*; in silico; BIOPEP-UWM database; proteomics; bioactive peptides; nano liquid chromatography tandem mass spectrometry (nanoLC–nanoESI MS/MS)

#### **1. Introduction**

Microalgae are eukaryotic unicellular organisms that grow easily on inexpensive substrates. Therefore, they are considered to be economical and effective raw materials for industry [1]. Many studies have sought to turn microalgae into useful products. Most have focused on the potential of microalgae for biofuel production, owing to their lipid content and abundant availability [1,2]. However, with the increasing population and demand for protein, there is a call to further utilize microalgae as protein sources and shift away from animal proteins.

Microalgae are known as one of many promising alternative sources of protein, as they offer up to 50% (*w/w*) protein [3] with a well-balanced amino acid profile that meets human nutritional requirements [1,4]. Several studies have reported the biological activities of various microalgae protein hydrolysates, including immunostimulant and antitumor activities from *C. sorokiniana* [5,6], angiotensin-I converting enzyme (ACE) inhibitory and hypotensive activities from *Chlorella vulgaris* [7,8] and *Nannochloropsis oculata* [9], antioxidant effects from *Navicula incerta* [10] and *Chlorella ellipsoidea* [11], anti-inflammatory effects from *Spirulina maxima* [12] and antibacterial properties from *Spirulina platensis* [13]. From *Chlorella sorokiniana*, Lin et al. [14] purified and identified four active peptides with high ACE inhibitory effects. A variety of other compounds that are useful for humans and animals have also been detected in microalgae. Morgese et al. [15] previously reported the beneficial effect of omega (ω)-3 and ω-6 polyunsaturated fatty acids (PUFAs) from *Chlorella sorokiniana* on emotional, cognitive and social behavior in rats. Talero et al. [16] also reviewed bioactive compounds of microalgae that have a chemopreventive effect on chronic inflammation and cancer. In line with those findings, the exploration of biological activities from microalgae has gained significant attention with regard to the health-promoting properties of their bioactive compounds. Nevertheless, further studies are still needed to identify more bioactive peptides from *C. sorokiniana*, including dipeptidyl peptidase-IV (DPP IV) inhibitory peptides, ACE-inhibitory peptides and antioxidant peptides.

With advances in protein analysis, techniques for protein identification and predictive analysis of potential bioactive peptides have been established. Proteomics techniques have been widely used to analyze the proteins present in a protein sample [17]. Moreover, mass spectrometry (MS)-based proteomics has been successfully applied to identify proteins from complex materials, including protein characterization of chickpea and oat seeds [18], fish authentication [19], species identification of spoilage and pathogenic bacteria [20], and protein characterization of tilapia processing co-products [21]. Once the protein sequences are obtained, a bioinformatics tool such as the BIOPEP-UWM database is used to predict the bioactive peptides contained in the protein sequences [22,23]. Bioinformatics, also known as the in silico technique, is a computational method used to estimate bioactive peptides from known protein sequences [24]. It also allows the simulation of enzymatic hydrolysis by proteases to predict the bioactive peptides theoretically released from intact protein sequences [25–27].

The application of proteomics coupled with BIOPEP-UWM can provide a rapid method to identify and characterize proteins, and can reduce the cost and time required to predict potential bioactive peptides. Thus, this study aimed to characterize the proteins isolated from *C. sorokiniana* using in-gel digestion and proteomics techniques. Furthermore, the BIOPEP-UWM database tool was used to predict the potential bioactive peptides derived from the identified proteins of *C. sorokiniana*.

#### **2. Results and Discussion**

#### *2.1. Identified Proteins of C. sorokiniana*

The protein content of the *C. sorokiniana* isolates was 65.08 ± 0.88%, with a yield of 4.40% (w/w, initial biomass dry basis). A total of 10 distinct protein bands of *C. sorokiniana* proteins were observed in a 12% acrylamide gel (Figure 1). These labeled protein bands (A–J) were used for in-gel digestion and subsequently analyzed using nanoLC–nanoESI MS/MS. The molecular weights (MWs) of the proteins estimated by SDS-PAGE were 109.02 (A), 72.08 (B), 54.15 (C), 45.72 (D), 38.33 (E), 29.04 (F), 24.96 (G), 21.13 (H), 16.59 (I), and 7.82 (J) kDa. Eight protein hits from the selected bands were found in the NCBI database, namely chloroplast rubisco activase, 50S ribosomal protein L7/L12 (chloroplast), phosphoglycerate kinase, Fe-superoxide dismutase, heat shock protein 70, ATP synthase subunit beta (chloroplast), elongation factor 2, partial, and V-type H+ ATPase subunit A, partial. The protein hits, their accession numbers from the NCBI database, amino acid (AA) lengths, and molecular weights as reported in the NCBI database and as estimated by SDS-PAGE are presented in Table 1. The MWs estimated by SDS-PAGE were comparable to the theoretical molecular weights reported in the NCBI database, except for elongation factor 2, partial and V-type H+ ATPase subunit A, partial.

**Table 1.** Identified *C. sorokiniana* proteins by SDS-PAGE and nanoLC–nanoESI MS/MS analysis.


**Figure 1.** Twelve percent SDS-PAGE of *C. sorokiniana* protein isolates. M: Protein marker; CSPI: *C. sorokiniana* protein isolate.

*C. sorokiniana* is a freshwater green alga with high protein content [28]. This species of the genus *Chlorella* was originally called *C. pyrenoidosa* [29]. The NCBI database lists a total of 20,925 proteins from *C. sorokiniana*. Most of them are enzymes responsible for various cell functions. According to the nanoLC–nanoESI MS/MS data, eight protein hits from the NCBI database corresponded to the proteins of *C. sorokiniana*. Phosphoglycerate kinase (*Auxenochlorella pyrenoidosa*, NCBI accession number: AKP17751.1) was detected in all selected protein bands in the SDS-PAGE. Watson et al. [30] stated that phosphoglycerate kinases derived from various protein sources are all monomers with MWs around 45 kDa. Moreover, their amino acid compositions and catalytic functions are similar [31]. The reference molecular weight of phosphoglycerate kinase in the NCBI database was 49.13 kDa; thus, the phosphoglycerate kinase found in band D (45.72 kDa) was used for further analyses. The possibility of finding the same protein in different bands is high, since proteins are denatured and separated in SDS-PAGE. In addition, protein hits discovered by the Mascot database search were identified based on the matched tryptic peptides detected by mass spectrometry.

#### *2.2. Identified Tryptic Peptides from C. sorokiniana Proteins*

Tryptic peptides derived from the identified proteins of *C. sorokiniana* by in-gel digestion were evaluated by nanoLC–nanoESI MS/MS analysis. The tryptic peptides identified through mass spectrometry were those matching protein hits in the Mascot database. Tryptic peptides are generated by trypsin during the in-gel digestion process, which is part of the proteomics technique. The MS/MS ion search for the tryptic digests revealed that all the tryptic peptides from the identified proteins were doubly or triply charged. Figures 2 and 3 present representative spectra of the doubly and triply charged peptides from the identified proteins of *C. sorokiniana* by proteomics analysis.

Figure 2 illustrates a doubly charged tryptic peptide of *C. sorokiniana* from protein band D (Figure 1). The observed signal at m/z 1053.01, marked with a red box, represents a doubly charged peptide (adjacent signals differ by 0.50, insert A), and the nanoLC–nanoESI MS/MS fragmentation spectrum of NFNNIEDGFYISPAFLDK, found in chloroplast rubisco activase (NCBI accession no. AEL29575.1), is shown in insert B. Figure 3 illustrates a triply charged tryptic peptide found in the same identified protein of the same protein band D. The observed signal at m/z 618.66, also marked with a red box, represents a triply charged peptide (adjacent signals differ by 0.33, insert A), and the nanoLC–nanoESI MS/MS fragmentation spectrum of the tryptic peptide LVDAFPGQSIDFFGALR, found in chloroplast rubisco activase (NCBI accession no. AEL29575.1), is shown in insert B.

**Figure 2.** NanoLC–nanoESI MS/MS spectra (m/z region 350–1600) of *C. sorokiniana* protein band D, m/z 1053.01 signal in red box. Insert A presents the identified doubly charged signal by the difference of 0.50 between signals. Insert B illustrates the nanoLC–nanoESI MS/MS fragmentation spectrum of the peptide NFNNIEDGFYISPAFLDK, calculated MW 2102.99 Da.

**Figure 3.** NanoLC–nanoESI MS/MS spectra (m/z region 350–1600) of *C. sorokiniana* protein band D, m/z 618.66 signal in red box. Insert A shows the identified triply charged signal by the difference of 0.33 between signals. Insert B illustrates the nanoLC–nanoESI MS/MS fragmentation spectrum of the peptide LVDAFPGQSIDFFGALR, calculated MW 1851.95 Da.
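The relationship between peptide mass, charge state, and the spacing of adjacent isotope signals can be checked with a short Python sketch. It uses standard monoisotopic residue masses (rounded to five decimals) and is only an illustration, not the Mascot search used in this study; the function names are hypothetical. Under these assumptions, the calculated masses agree with the values given in the figure captions, and the first isotope peaks fall close to the m/z signals quoted in the text.

```python
# Standard monoisotopic residue masses (Da), rounded to five decimals
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056      # H2O added for the free N- and C-termini
PROTON = 1.00728      # mass of a proton
ISOTOPE_GAP = 1.00335 # approximate 13C - 12C mass difference

def peptide_mass(seq):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER

def mz(mass, z):
    """m/z of the [M + zH]z+ ion."""
    return (mass + z * PROTON) / z

def isotope_spacing(z):
    """Spacing between adjacent isotope peaks for charge state z."""
    return ISOTOPE_GAP / z

for seq, z in [("NFNNIEDGFYISPAFLDK", 2), ("LVDAFPGQSIDFFGALR", 3)]:
    m = peptide_mass(seq)
    mono = mz(m, z)
    first_isotope = mono + isotope_spacing(z)
    print(f"{seq}: M = {m:.2f} Da, z = {z}, monoisotopic m/z = {mono:.2f}, "
          f"first isotope peak = {first_isotope:.2f}, spacing = {isotope_spacing(z):.2f}")
```

The spacing of roughly 1/z between adjacent isotope peaks (about 0.50 for z = 2 and 0.33 for z = 3) is what allows the charge state to be read from inserts A of Figures 2 and 3.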

In protein identification by proteomics analysis, trypsin is usually used to digest proteins in the gel [32]. Trypsin hydrolyzes peptide bonds specifically at the carboxyl side of arginine or lysine residues, but cleaves poorly when these residues are followed by proline. Consequently, tryptic peptides are typically doubly or triply charged in ESI, since each carries a basic residue at its C-terminus in addition to the N-terminal amino group, which explains the result of the MS/MS ion search [33].
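The cleavage rule described above can be expressed as a simple in silico digestion, sketched here in Python. The rule is idealized (complete cleavage after K or R, none before P, no missed cleavages), and the function name and the test sequence are hypothetical.

```python
def trypsin_digest(seq):
    """Cleave after K or R unless the next residue is P (idealized rule)."""
    peptides = []
    start = 0
    for i, aa in enumerate(seq):
        is_last = i == len(seq) - 1
        if aa in "KR" and not is_last and seq[i + 1] != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    peptides.append(seq[start:])  # remaining C-terminal peptide
    return peptides

# Toy sequence: the R-P bond is left intact, the other K/R sites are cleaved
print(trypsin_digest("AKRPGKLR"))
```

Every internal peptide produced in this way ends in lysine or arginine, which is the structural reason for the charge states discussed above.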

#### *2.3. Potential Bioactive Peptides from Identified Proteins in C. sorokiniana*

Potential bioactive peptides present in the identified proteins of *C. sorokiniana* were investigated using the BIOPEP-UWM database. The amino acid sequences of six proteins, namely chloroplast rubisco activase, 50S ribosomal protein L7/L12 (chloroplast), phosphoglycerate kinase, Fe-superoxide dismutase, heat shock protein 70 and ATP synthase subunit beta (chloroplast), were chosen because, based on the SDS-PAGE results, they were relatively abundant components of the *C. sorokiniana* proteins. Moreover, their estimated molecular weights corresponded to those in the NCBI database (Table 1). The profile of the potential bioactive peptides, their biological activities (ACE inhibitory, antioxidant, anti-amnestic, antithrombotic, stimulating, regulating, DPP IV inhibitory), and their frequencies are summarized in Table 2. Results revealed that most of the potential bioactive peptides were dipeptides or tripeptides with multiple biological activities. The number of bioactive peptides was determined from the fragments of the amino acid sequences that were predicted to be potential bioactive peptides. The BIOPEP database reports the peptides, with their bioactivities, that occur in the submitted protein sequences and match entries in the database.
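The principle of such a profile, matching known bioactive fragments against a protein sequence, can be sketched in a few lines of Python. The sketch does not query BIOPEP-UWM: the fragment list is a tiny stand-in containing only the ACE-inhibitory sequences from the literature cited in this article, the function names are hypothetical, and the frequency is computed as the number of matched fragments divided by the sequence length, which is assumed here to mirror the database's frequency parameter.

```python
# Small stand-in list of ACE-inhibitory peptides cited in this article [8,14]
ACE_INHIBITORS = ["WV", "VW", "IW", "LW", "IVVE", "AFL", "FAL",
                  "AEL", "VVPPA", "IAE", "IAPG", "VAF"]

def find_fragments(protein, fragments):
    """Return (1-based position, fragment) for every occurrence, overlaps included."""
    hits = []
    for frag in fragments:
        pos = protein.find(frag)
        while pos != -1:
            hits.append((pos + 1, frag))
            pos = protein.find(frag, pos + 1)
    return sorted(hits)

def frequency(protein, fragments):
    """Matched fragments per residue (assumed analogue of the frequency A = a/N)."""
    return len(find_fragments(protein, fragments)) / len(protein)

# One of the tryptic peptides identified in this study, used as a short test input
peptide = "NFNNIEDGFYISPAFLDK"
print(find_fragments(peptide, ACE_INHIBITORS))
print(round(frequency(peptide, ACE_INHIBITORS), 3))
```

Because short fragments such as dipeptides and tripeptides match far more often than longer ones, a full database scan of a whole protein naturally yields the large counts of DPP IV and ACE inhibitors reported in Table 2.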

**Table 2.** Number of potential bioactive peptides of identified *C. sorokiniana* proteins using BIOPEP's "profiles of potential biological activities" tool.


Abbreviation: ACE Inhibitory (AC), antioxidant (AO), anti-amnestic (AA), antithrombotic (AT), stimulating (S), regulating (R), dipeptidyl peptidase-IV (DPP IV) inhibitory (DPP).

Chloroplast rubisco activase (NCBI accession no. AEL29575.1) and phosphoglycerate kinase (NCBI accession no. AKP17751.1) were chosen to illustrate the profile of bioactive peptides within a protein (Figures 4 and 5) because they appeared to be relatively abundant and were found in almost all picked bands in the SDS-PAGE. As shown in Figure 4, the molecular weights of the tryptic peptides corresponded to the identified tryptic peptides at amino acid positions 100–117, 132–144, 158–181, 187–145, 149–213, 255–273, 290–304, and 310–326 (matched tryptic peptides shown in red letters) in the chloroplast rubisco activase amino acid sequence. BIOPEP-UWM analysis showed that the potential bioactive peptides encrypted in the chloroplast rubisco activase amino acid sequence were mostly DPP IV inhibitors (250 peptide fragments, marked with an orange line) and ACE inhibitors (187 peptide fragments, marked with a green line). Other bioactive peptides found were 18 antioxidant, 3 anti-amnestic, 2 antithrombotic, 18 stimulant and 3 regulatory peptides. Some bioactive peptides, such as VPL, WG, LA, IR, PG, VY, and KP, have multiple activities.

On the other hand, Figure 5 shows the molecular weights of the tryptic peptides in phosphoglycerate kinase, which correspond to the theoretical tryptic peptides at amino acid positions 232–244, 258–268, 285–298, 303–313, 383–411, and 436–465 (matched tryptic peptides shown in red letters). There were 297 DPP IV inhibitor, 224 ACE inhibitor, 23 antioxidant, 33 stimulant, 5 anti-amnestic, 4 antithrombotic, and 6 regulatory peptides embedded in the amino acid sequence of phosphoglycerate kinase. Moreover, the profiles of the bioactive peptides of 50S ribosomal protein L7/L12 (chloroplast), Fe-superoxide dismutase, heat shock protein 70 and ATP synthase subunit beta (chloroplast) also show the presence of the above-mentioned bioactive peptides in these proteins, except that Fe-superoxide dismutase does not show anti-amnestic, antithrombotic, or regulating peptides. In all the proteins, DPP IV and ACE inhibitors were the most abundant bioactive peptides.

The amino acid composition and sequence of the proteins largely determine the presence of these bioactive peptides. Results also revealed that most of the DPP IV inhibitory peptides present in the identified proteins contained proline (P), alanine (A), glycine (G), valine (V) and leucine (L) residues. DPP IV preferentially cleaves dipeptides with proline and alanine residues at the N-terminal side of the peptide [34]. It also has relatively lower cleavage rates with serine, glycine, leucine, and valine [35,36]. Moreover, the presence of basic and hydrophobic amino acids at the N-terminal side of the peptides could enhance the cleavage susceptibility of the substrate [34,37]. DPP IV inhibitors have also been reported from various protein sources by in silico approaches, including barley, canola, oat, soybean, wheat, quinoa, chicken egg, bovine milk, bovine meat, pig, tuna, Atlantic salmon, chum salmon, tilapia skin and frame, and *Palmaria palmata* [21,27,38–40]. On the other hand, the abundance of ACE inhibitory peptides in the identified proteins might also have been influenced by the amino acid compositions of the proteins. Peptides with residues such as phenylalanine (F), tyrosine (Y), tryptophan (W), or proline (P) at the C-terminal side have been reported to exhibit highly potent ACE inhibitory activity [41–45]. The amino acid residue adjacent to proline can also influence the potency of the ACE inhibitor, which is usually enhanced by hydrophobic amino acids [46]. In silico analysis of different proteins has revealed an abundance of ACE inhibitors embedded in various protein sequences [47–49]. In previous studies, some amino acid sequences of ACE inhibitory peptides from *C. sorokiniana* were discovered. Lin et al. [14] reported that the IC50 values of WV, VW, IW, and LW were 307.61, 0.58, 0.50, and 1.11 μM, respectively. Moreover, *C. sorokiniana* protein hydrolysates could reduce systolic and diastolic blood pressure by 20 and 21 mm Hg, respectively. Suetsuna and Chen [8] also reported several amino acid sequences with potential antihypertensive activity after oral administration, such as IVVE (IC50: 315.3 μM), AFL (IC50: 63.8 μM), FAL (IC50: 26.3 μM), AEL (IC50: 57.1 μM), and VVPPA (IC50: 79.5 μM) from *C. vulgaris*, and IAE (IC50: 34.7 μM), FAL, AEL, IAPG (IC50: 11.4 μM), and VAF (IC50: 35.8 μM) from *S. platensis*. Those findings indicate that ACE inhibitory peptides predicted through in silico analysis can possess antihypertensive activity in vitro and in vivo.

**Figure 4.** Protein sequence and potential bioactive peptides of chloroplast rubisco activase (AEL29575.1) from *C. sorokiniana*. Abbreviation: ACE Inhibitory (ACE), Antioxidant (AO), Anti-amnestic (AA), Antithrombotic (AT), Stimulating (S), Regulating (R), DPP IV inhibitory (DPP).

For many years now, in silico analysis has been successfully used to predict the potential application of various proteins as a source of bioactive peptides [22]. It provides sufficient information for determining the potential biological activity of proteins which is much faster than conventional methods [21]. The results of the in silico analysis by BIOPEP-UWM suggest the potential of *C. sorokiniana* proteins for pharmaceutical application as demonstrated by its bioactivities. These peptides in the intact proteins are inactive and need to be released in order to perform their functions [50]. The prediction of the potential bioactivities of the proteins after digestion by various proteases can be conducted by the BIOPEP-UWM database tool.

**Figure 5.** Protein sequence and potential bioactive peptides of phosphoglycerate kinase (AKP17751.1) from *C. sorokiniana*. Abbreviations: ACE inhibitory (ACE), antioxidant (AO), anti-amnestic (AA), antithrombotic (AT), stimulating (S), regulating (R), DPP IV inhibitory (DPP).

#### *2.4. Prediction of Potential Bioactive Peptides after Protease Cleavage using BIOPEP-UWM Tool*

Identified proteins such as chloroplast rubisco activase, phosphoglycerate kinase, Fe-superoxide dismutase, heat shock protein 70 and ATP synthase subunit beta (chloroplast) were further analyzed using the "enzyme action" tool in the BIOPEP-UWM database; these proteins showed the largest numbers of bioactivities in their profiles of bioactive peptides (Table 2). Results of the 15 simulations of enzymatic hydrolysis for each protein sequence are presented in Table 3. The table shows the number of bioactive peptides with specific bioactivities after hydrolysis of the individual proteins by various proteases. DPP IV inhibitory peptides were the dominant products released from the selected proteins by the different proteases. ACE inhibitory peptides were also released in relatively high numbers, but fewer than DPP IV inhibitory peptides. This is consistent with the profile of the potential bioactive peptides from the proteins in Table 2. Bromelain, papain, ficin and pepsin (pH > 2) were the individual proteases that released the largest and most diverse sets of peptides with certain biological activities from all the selected proteins. Meanwhile, trypsin released the lowest number of bioactive peptides after in silico hydrolysis. Trypsin is the most commonly used enzyme in proteomics [32]; however, in the in silico analysis, it did not release significant numbers of potential bioactive peptides. Nonetheless, based on the results, other single-action enzymes could also release relatively high numbers of bioactive peptides. The BIOPEP-UWM database also allows hydrolysis with a combination of enzymes. A combination of two to a maximum of three enzymes can be used in the hydrolysis simulation of the proteins. The combination of three enzymes (trypsin, α-chymotrypsin, and pepsin) has been reported to produce potential anti-inflammatory peptides in microalgae, such as LDAVNR and MMLDF [12].
Table 3 reveals that the combined action of two to three enzymes could lead to the release of higher numbers of bioactive peptides from the selected proteins. This implies that combined enzyme action cleaves peptide bonds more effectively than single enzyme action, except for pepsin, which released almost the same number of peptides as the combined enzyme action. Several in vitro studies have reported that pepsin produces various bioactive peptides from microalgae hydrolysates, such as ACE inhibitory and antioxidant peptides. Samarakoon et al. [9] reported that pepsin generated more potential ACE inhibitory peptides than other proteases, such as GMNNLTP (IC50: 123 μM) and LEQ (IC50: 173 μM). Ko et al. [11] also identified LNGDVW from peptic hydrolysates, which strongly scavenged peroxyl, DPPH and hydroxyl radicals with IC50 values of 0.02, 0.92 and 1.42 mM, respectively. Moreover, peptic hydrolysates from microalgae showed strong antioxidant activities [51,52].

Furthermore, in comparison to the other three proteins, ATP synthase subunit beta showed a higher tendency to release bioactive peptides with the different proteases. However, these theoretically produced bioactive peptides may not always show comparable function in in vitro and in vivo analyses; thus, they should be studied further in vitro and in vivo. Nevertheless, the BIOPEP-UWM "enzyme action" tool was able to provide reference information on the possible bioactive peptides that could be released from the selected proteins using various proteases.
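The logic of an in silico hydrolysis can be sketched in a few lines. The example below is an illustration only and does not reproduce the BIOPEP-UWM "enzyme action" tool: it assumes a commonly used simplified trypsin rule (cleavage after K or R unless the next residue is P), a hypothetical toy sequence, and a small reference set built from ACE inhibitory peptides mentioned in the text (VW, FAL, IAPG).

```python
import re

# Small reference set taken from peptides named in the text (assumption:
# used here only as a stand-in for a bioactive peptide database).
KNOWN_ACE_INHIBITORS = {"VW", "FAL", "IAPG"}


def digest_trypsin(sequence: str) -> list[str]:
    """Cleave after K or R unless the next residue is P (simplified rule)."""
    # Zero-width split points: preceded by K/R and not followed by P.
    return [frag for frag in re.split(r"(?<=[KR])(?!P)", sequence) if frag]


toy_sequence = "MKVWRFALKIAPG"  # hypothetical sequence, not a C. sorokiniana protein

fragments = digest_trypsin(toy_sequence)

# A peptide counts as "released" only if a fragment matches it exactly.
released = [frag for frag in fragments if frag in KNOWN_ACE_INHIBITORS]

# Motifs that are present in the intact sequence but remain embedded in
# longer fragments after digestion.
embedded_only = sorted(
    motif for motif in KNOWN_ACE_INHIBITORS
    if motif in toy_sequence and motif not in released
)

print(fragments)      # fragments produced by the simplified rule
print(released)       # reference peptides released as free fragments
print(embedded_only)  # reference peptides still locked in longer fragments
```

In this toy case only IAPG is released as a free fragment, whereas VW and FAL stay embedded in VWR and FALK, which illustrates why the choice of protease changes the number of bioactive peptides obtained from the same protein.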


#### **3. Materials and Methods**

#### *3.1. Materials*

The microalga *C. sorokiniana* was obtained from the Taiwan Chlorella Manufacturing Co., Ltd. (Taipei, Taiwan), considered the largest producer of *Chlorella*, with an average annual production of 400 tons of dried biomass [1]. All reagents and chemicals used were of analytical grade.

#### *3.2. Protein Isolation*

The protein isolation process was adapted from the procedure of Parimi et al. [53] with modifications. Briefly, a *C. sorokiniana* biomass slurry was prepared at a 1:16 (*w/v*) ratio. The slurry was pretreated by sonication for 1 h, followed by alkaline protein extraction by solubilization at pH 11.38 using 2 M NaOH for 35 min with stirring. The supernatant was then subjected to isoelectric precipitation at pH 4.01 with 1 M HCl and stirred for 60 min. Centrifugation at 8750× *g* for 35 min was used for solid-liquid separation in the solubilization and precipitation steps. The protein isolate was lyophilized and stored at −20 °C until further use. The modified Lowry method [54] was used to determine the protein content of the isolate.

#### *3.3. SDS-PAGE Analysis*

SDS-PAGE was performed according to the method described by Schägger and Von Jagow [55] using a 4% stacking gel (*w/v*) and a 12% polyacrylamide gel (*w/v*). Ten milligrams of protein isolate was dissolved in 1 mL of denaturing sample buffer (0.5 M Tris-HCl pH 6.8, glycerol, 10% SDS, *w/v*, 0.5% bromophenol blue, *w/v*, β-mercaptoethanol) and heated at 95 °C. Then, 10 μL of the sample was loaded into the sample wells. Protein separation was carried out at 80 V for 30 min, followed by 110 V for 90 min for the resolving gel, using a Mini Protean II unit (Bio-Rad Laboratories, Hercules, CA, USA). The gel was stained for 40 min with Brilliant Blue (Bio-Rad, Coomassie R250). The gel was destained three times using water/methanol/acetic acid (7/2/1, *v/v/v*) for 15 min per cycle with shaking on an orbital shaker (Fristek S10, Taichung city, Taiwan). The molecular mass of the proteins was estimated using a protein molecular mass marker (250 to 10 kDa, Bio-Rad), 5 μL of which was loaded into a sample well. The gel was scanned with an E-Box VX5 (Vilber Lourmat, Paris, France), and the captured image was analyzed using Vision Capt software (V16.08a, Vilber Lourmat, Paris, France).

#### *3.4. Proteomics Techniques*

#### 3.4.1. In-Gel Tryptic Digestion

The following proteomics experiments were carried out at Academia Sinica, Nangang District, Taipei City, Taiwan. Proteomics techniques were adapted from the methods described by Chang et al. [18]. Gel slicing and in-gel digestion were performed using a combination of the modified methods of Rosenfeld et al. [56] and Shevchenko et al. [32]. Briefly, 10 intensely stained protein bands were excised from the SDS-PAGE gel for in-gel digestion. The gel pieces were destained with 25 mM ammonium bicarbonate (ABC)/50% acetonitrile (ACN) solution in PP microcentrifuge tubes. The destained gel pieces were then soaked in 100 μL of 50 mM dithioerythritol (DTE)/25 mM ABC at 37 °C for 1 h. The tubes were centrifuged and the DTE solution was removed. For alkylation, the gel pieces were soaked in 100 μL of 100 mM iodoacetamide (IAM)/25 mM ABC at room temperature in the dark for 1 h. The IAM solution was removed after centrifugation. The gel pieces were washed by soaking in 200 μL of 50% ACN/25 mM ABC for 15 min. The solution was removed after centrifugation and the process was repeated four times. The gel slices were then soaked in 100 μL of 100% ACN for 5 min, repeated twice, and the solution was discarded after centrifugation. The gel slices were dried for 5 min using a Speed Vac (Thermo Scientific, Waltham, MA, USA). Enzymatic digestion followed: Lys-C/25 mM ABC (enzyme:protein, 1:50) was added and the mixture was incubated for 1 h at 37 °C. Afterwards, the same amount of trypsin was added and the mixture was incubated for 16 h at 37 °C. The tryptic peptides were then extracted with 50 μL of 50% ACN/5% trifluoroacetic acid (TFA). The peptide extracts were transferred to new tubes and dried using a Speed Vac (Thermo Scientific, Waltham, MA, USA). Finally, the peptide extracts were purified using C18 Zip-Tip. The purified peptide extracts were used for the nanoLC–nanoESI MS/MS analysis.

#### 3.4.2. Nanoliquid Chromatography–Nanoelectrospray Ionization Tandem Mass Spectrometry (NanoLC–nanoESI MS/MS) Analysis

The dried tryptic peptide digest was subjected to nanoLC–nanoESI MS/MS analysis using a nanoAcquity system (Waters, Milford, MA, USA) connected to an LTQ Orbitrap Velos hybrid mass spectrometer (Thermo Electron, Bremen, Germany) equipped with a PicoView nanospray interface (New Objective, Woburn, MA, USA). The tryptic peptide mixtures were loaded onto a 75 μm ID, 25 cm length C18 BEH column (Waters, Milford, MA, USA) packed with 1.7 μm particles with a pore width of 130 Å. Separation was performed using a segmented gradient from 5 to 35% solvent B (acetonitrile with 0.1% formic acid) over 60 min at a flow rate of 300 nL/min and a column temperature of 35 °C. Solvent A was 0.1% formic acid in water (*v/v*). The mass spectrometer was operated in the data-dependent mode. In brief, survey full-scan MS spectra were acquired in the orbitrap (m/z 350–1600) with the resolution set to 60 K at m/z 400 and an automatic gain control (AGC) target of 10<sup>6</sup>. The 20 most intense ions were sequentially isolated for collision-induced dissociation (CID) MS/MS fragmentation and detection in the linear ion trap (AGC target of 10,000), with previously selected ions dynamically excluded for 60 s. Ions with single or unrecognized charge states were also excluded. The LTQ-Orbitrap data were acquired at the Academia Sinica Common Mass Spectrometry Facilities located at the Institute of Biological Chemistry, Academia Sinica, Nangang District, Taipei City, Taiwan.

#### 3.4.3. Tandem MS Data Analysis of Proteins and Peptide Identification

First, the MS raw data were converted to PKL files using the de novo sequencing parameter in the ProteinLynx software coupled with Mascot MS/MS ion search (http://140.112.52.63/mascot/cgi/search\_form.pl?FORMVER=2;SEARCH=MIS) [57]. MS/MS data were searched against the National Center for Biotechnology Information (NCBI) database (https://www.ncbi.nlm.nih.gov/), accessed on March 15, 2018 [58], for Viridiplantae (green plants) entries. Search parameters were set as follows: carbamidomethyl cysteine as fixed modification; oxidation (M) as variable modification; 10 ppm peptide mass tolerance; 2+, 3+, and 4+ peptide charges; ±0.6 Da MS/MS tolerance; ESI-TRAP as the instrument; and trypsin as the enzyme with 2 missed cleavages allowed. Peptide masses were acquired as monoisotopic masses.
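To make the 10 ppm peptide mass tolerance concrete, the sketch below shows how a relative mass error is usually computed and compared with such a threshold. It is an illustration under stated assumptions, not the search engine's implementation; the function names and the example masses are hypothetical.

```python
def ppm_error(observed_mass: float, theoretical_mass: float) -> float:
    """Relative mass error in parts per million (ppm)."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass: float, theoretical_mass: float,
                     tol_ppm: float = 10.0) -> bool:
    """True if the absolute mass error does not exceed the ppm tolerance."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm


# Hypothetical monoisotopic masses (Da) chosen only for illustration.
theoretical = 1000.0
print(within_tolerance(1000.005, theoretical))  # about 5 ppm off
print(within_tolerance(1000.020, theoretical))  # about 20 ppm off
```

For a peptide of 1000 Da, a 10 ppm window corresponds to ±0.01 Da, which is far narrower than the ±0.6 Da tolerance applied to the fragment ions measured in the linear ion trap.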

The Mascot ion score was −10 × Log(P), where P is the probability that the observed match is a random event. Individual ion scores of > 45 indicated identity or extensive homology (*p* < 0.05). Protein scores were derived from ion scores as a non-probabilistic basis for ranking protein hits (Matrix Science, London, United Kingdom). The sequence coverage of protein hits was expressed as a percentage (%) indicating the sequence homology of identified tryptic peptides from *C. sorokiniana* to the corresponding protein hits based on the Mascot MS/MS ion search results [21].
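The score definition above can be written out as a pair of conversions. This is a minimal sketch assuming a base-10 logarithm; the function names are hypothetical. It only converts between P and the score, and it does not derive the significance threshold, which depends on the size of the searched database.

```python
import math


def ion_score(p_random: float) -> float:
    """Ion score = -10 * log10(P), with P the probability of a random match."""
    return -10.0 * math.log10(p_random)


def probability_from_score(score: float) -> float:
    """Inverse conversion: P = 10 ** (-score / 10)."""
    return 10.0 ** (-score / 10.0)


print(ion_score(1e-3))             # a random-match probability of 0.001
print(probability_from_score(45))  # P corresponding to a score of 45
```

Each 10-point increase in the score corresponds to a tenfold decrease in the probability that the match is a random event.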

#### 3.4.4. In Silico Analysis of Bioactive Peptides and Enzyme Cleavages using BIOPEP-UWM Database Tools

Sequences of the identified *C. sorokiniana* proteins from the NCBI database were analyzed for bioactive peptides and enzyme cleavages using the BIOPEP-UWM database (http://www.uwm.edu.pl/biochemia/index.php/pl/biopep), accessed on March 15, 2018 [23], as described by Cheung et al. [25] with modifications. Briefly, the bioactivities, sequences, number and location of the peptides were obtained from the sequences of the identified proteins using the "profiles of potential bioactivity" tool. Moreover, the sequences of the identified proteins were examined using the "enzyme action" tool to simulate enzymatic hydrolysis. A total of 15 enzymatic hydrolysis simulations (12 individual proteases, one double enzyme action, and two triple enzyme actions) were conducted on each protein sequence. A list of all the potential bioactive peptides was obtained by directing the theoretical peptide sequence data to the "search for active fragments" option. The frequency of occurrence of the bioactive peptides in the intact proteins was computed as A = a/N, where A is the occurrence frequency, a is the number of bioactive peptides and N is the total number of amino acid residues in the protein sequence.
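The occurrence frequency A = a/N can be computed with a short script. The sketch below is illustrative only: the sequence is a hypothetical toy example, the motif list reuses ACE inhibitory peptides named earlier in the text, and counting overlapping matches is an assumption about how embedded fragments are tallied.

```python
def count_occurrences(sequence: str, motif: str) -> int:
    """Count occurrences of a motif in a sequence, allowing overlaps."""
    return sum(
        1
        for start in range(len(sequence) - len(motif) + 1)
        if sequence.startswith(motif, start)
    )


def occurrence_frequency(sequence: str, motifs: list[str]) -> float:
    """A = a / N: bioactive fragments found (a) per residue in the protein (N)."""
    a = sum(count_occurrences(sequence, motif) for motif in motifs)
    return a / len(sequence)


toy_sequence = "MKVWRFALKIAPG"     # hypothetical, N = 13 residues
ace_motifs = ["VW", "FAL", "IAPG"]  # peptides named in the text

A = occurrence_frequency(toy_sequence, ace_motifs)
print(round(A, 4))  # 3 fragments in 13 residues
```

Because A is normalized by sequence length, it allows proteins of different sizes to be compared as potential precursors of a given bioactivity.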

#### **4. Conclusions**

Proteomics techniques coupled with in silico analysis, as used in this study, provided a rapid method to identify the isolated proteins of *C. sorokiniana*, to predict potential bioactivities and to determine the appropriate proteases that theoretically release more bioactive peptides. The proteomics techniques identified tryptic peptides corresponding to eight proteins from the microalga. The in silico analysis using BIOPEP-UWM database tools revealed that the combined action of mixed enzymes and the single enzyme action of pepsin (pH > 2) could lead to the production of more diverse and larger numbers of potential bioactive peptides embedded in the protein sequences. According to the results, *C. sorokiniana* proteins are potential sources of bioactive peptides with various bioactivities. With the use of appropriate extraction methods and purification techniques for certain predicted bioactivities, these proteins could be a good alternative source of high-value compounds for pharmaceutical, medical, cosmetic and functional food applications to aid in human health maintenance and enhancement.

**Author Contributions:** Conceptualization, L.A.T., J.P.P., E.E.S.Y. and Y.-W.C.; methodology, L.A.T.; validation, L.A.T.; formal analysis, L.A.T.; investigation, L.A.T.; resources, L.A.T. and Y.-W.C.; data curation, L.A.T., J.P.P., E.E.S.Y. and Y.-W.C.; writing—original draft preparation, L.A.T.; writing—review and editing, F.C.A.P.; visualization, L.A.T.; supervision, J.P.P. and Y.-W.C.; funding acquisition, J.P.P. and Y.-W.C.

**Funding:** This research was funded by the Ministry of Science and Technology, Taiwan (MOST: 106-2311-B-019-001).

**Acknowledgments:** We thank Taiwan Chlorella Manufacturing Co., Ltd. (Taipei, Taiwan) for the donations of *C. sorokiniana* biomass samples used in this research.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Evolving a Peptide: Library Platforms and Diversification Strategies**

#### **Krištof Bozovičar and Tomaž Bratkovič \***

Department of Pharmaceutical Biology, Faculty of Pharmacy, University of Ljubljana, Aškerčeva Cesta 7, SI-1000 Ljubljana, Slovenia; kristof.bozovicar@ffa.uni-lj.si

**\*** Correspondence: tomaz.bratkovic@ffa.uni-lj.si; Tel.: +386-1-4769-570

Received: 2 December 2019; Accepted: 25 December 2019; Published: 27 December 2019

**Abstract:** Peptides are widely used in pharmaceutical industry as active pharmaceutical ingredients, versatile tools in drug discovery, and for drug delivery. They find themselves at the crossroads of small molecules and proteins, possessing favorable tissue penetration and the capability to engage into specific and high-affinity interactions with endogenous receptors. One of the commonly employed approaches in peptide discovery and design is to screen combinatorial libraries, comprising a myriad of peptide variants of either chemical or biological origin. In this review, we focus mainly on recombinant peptide libraries, discussing different platforms for their display or expression, and various diversification strategies for library design. We take a look at well-established technologies as well as new developments and future directions.

**Keywords:** peptide; combinatorial library; library design; screening; mutagenesis

#### **1. Introduction**

Peptides are short polymers composed of 19 L-amino acid and non-chiral glycine residues, linked by amide bonds. The definition is rather vague in terms of chain length, although an arbitrary upper limit of 6000 Da has been assigned for labeling these molecules as peptides, and polymers above that molecular mass are considered proteins [1]. Peptides are ubiquitous in nature and play a role in most physiological processes as host defense (antimicrobial) agents [2], (neuro)hormones [3,4], and toxins [5]. Peptide research has experienced considerable development in recent decades, and over 7000 peptides have been identified in nature [6]. Today, peptides are widely used in drug discovery, drug delivery, the food industry, cosmetics, and various other fields.

Peptides and small proteins isolated from natural sources have been used as medicines since the beginning of the 1920s [7], with bovine and porcine insulin being the first. The transition to synthetic peptide drugs only began in the 1950s, with synthetic oxytocin and vasopressin subsequently entering clinical use. However, native peptides have several drawbacks, most notably poor oral bioavailability and very short plasma half-life, which have tempered enthusiasm for their use and prompted investigators to develop peptides with improved pharmaceutical properties [8]. What really pushed the field in a new direction in the following decades was one of the biggest breakthroughs in understanding the fundamentals of life itself: the discovery of the genetic code and how it translates to the amino acid sequence, which laid the foundations for modern biotechnology. Gene identification and manipulation techniques that developed rapidly from 1973 onwards allowed large quantities of pure gene products to be produced [9]. The landmark event was the expression of the first recombinant peptide hormone, insulin, in *Escherichia coli* and the subsequent approval to commercialize recombinant insulin in 1982.

Peptides are utilized broadly owing to their superiority in specific cellular targeting. They bind cellular receptors with high potency and great selectivity, lowering the potential for toxicity and the occurrence of off-target effects. In addition, peptides in the body are degraded to amino acids, further lowering the risk of toxicity [10]. Chemical synthesis enables peptide production in large quantities, reducing production costs compared to other biologics. Further attributes include stability at room temperature and good tissue permeability. Furthermore, the physico-chemical traits of peptides (e.g., solubility, hydrophobicity, and charge), their metabolic stability, and their residence time in the body can be fine-tuned through chemical modifications. An iterative chemical modification approach can be applied to develop peptide therapeutics with improved properties [11], including extraordinary target affinity [12].

Peptide development in medicine is concentrated in metabolic diseases, oncology, and cardiovascular diseases; not surprisingly, these are all areas of highest interest to the pharmaceutical industry. By 2018, more than 60 peptide drugs (excluding insulins and other small proteins) had been approved in the US, Europe, and Japan, over 150 were in active clinical development, and an additional 260 had been assessed in human clinical trials but did not make it to the market [8]. The peptide therapeutics market was valued at 19,475 million USD in 2015 and is estimated to more than double by 2024, reaching 45,542 million USD [13].

During the past decade, peptides have also been used in a wide range of applications in other fields. They are found in biosensor applications as biorecognition molecules and are conjugated with transducers or molecular beacons that aid signal detection [14,15]. Additionally, they serve as surfactants or tags promoting solubility of recombinant intrinsic membrane proteins [16–20], increasing their yield, activity, and aiding protein structural studies. Peptides are even replacing enzymes in catalytic reactions [21] and substituting proteins as ligands in affinity chromatography [22,23].

Discovery and design of novel peptides can be guided by various strategies. In this review, we focus mainly on the use of peptide and peptide aptamer [24] (sequences of 5–20 amino acid residues, grafted into loops of a robust protein scaffold) libraries generated through recombinant DNA technology, but discuss chemical peptide libraries as well.

#### **2. Combinatorial Peptide Libraries**

Peptides of great number and diversity occur as a natural form of combinatorial chemistry. Conversely, exploiting evolutionary principles in the laboratory by constructing and screening large peptide libraries can yield new lead compounds with desired traits. The discovery of novel binders is a multifaceted process that involves screening thousands or even millions of potential candidates from combinatorial libraries in vitro, as is common in target-based drug discovery. Target-based drug discovery (sometimes called "reverse pharmacology") is the opposite of the traditional phenotypic screening strategy. The latter typically leads to the identification of molecules that modify a disease phenotype by acting on a previously unidentified target [25]. In contrast, the targets in the target-based approach are well defined. With the molecular target in hand, discovery of novel binders can be facilitated by crystallographic and biochemical studies, computational modeling, binding kinetics, and mutational analysis, which give insight into how the target and the ligand interact and thus enable efficient structure-activity relationship (SAR) analysis and the development of future generations of binders [26].

Combinatorial peptide libraries can be categorized into two groups: chemical peptide libraries, which are produced via organic synthesis, and biological libraries. The choice of a library platform should be guided by practical considerations. The required library size, the experience of operators, available equipment, and other technical considerations may well limit the choice [27]. In principle, library-based peptide discovery adheres to the following paradigm: (1) creation of a pooled peptide library, (2) screening of the library against the target molecule and isolation of hits, and (3) hit identification.

Various screening/selection methods are available, depending on the peptide library platform. Normally, screening peptide libraries involves incubating the library with a fluorescently labeled soluble target or target-coated magnetic beads, followed by flow cytometry-based sorting such as fluorescence-activated cell sorting (FACS) [28], or magnetic separation techniques like magnetic-activated cell sorting (MACS) [29], respectively. The former is mostly used for cell-based peptide libraries, although it has also been used for screening chemical library systems such as the one-bead-one-compound platform [30] (see below). Hit identification also depends on the library type; either iterative deconvolution or positional scanning methods are used for chemical libraries, while sequencing is typically employed for DNA-encoded platforms. In recent years, next-generation sequencing (NGS) methods, capable of massively parallel nucleic acid sequence determination, have transformed the screening of biological libraries, enabling detection of low-abundance clones and quantification of changes in clone copy numbers without performing many rounds of selection. This technology has been paired with various library platforms, immensely enhancing the throughput of these methods [31–38].
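How changes in clone copy numbers can be quantified from NGS read counts may be sketched as follows. This is a simplified illustration, not a method taken from the cited studies: the clone names and read counts are hypothetical, and the use of a pseudocount to avoid division by zero is an assumption.

```python
def enrichment(before: dict[str, int], after: dict[str, int],
               pseudocount: float = 1.0) -> dict[str, float]:
    """Ratio of each clone's read frequency after vs. before selection."""
    clones = set(before) | set(after)
    # Totals include the pseudocount added to every clone.
    total_before = sum(before.values()) + pseudocount * len(clones)
    total_after = sum(after.values()) + pseudocount * len(clones)
    ratios = {}
    for clone in clones:
        freq_before = (before.get(clone, 0) + pseudocount) / total_before
        freq_after = (after.get(clone, 0) + pseudocount) / total_after
        ratios[clone] = freq_after / freq_before
    return ratios


# Hypothetical read counts for two clones in the naive and selected pools.
naive_pool = {"pepA": 99, "pepB": 99}
selected_pool = {"pepA": 9, "pepB": 189}

ratios = enrichment(naive_pool, selected_pool)
print(ratios)  # values above 1 indicate enrichment during selection
```

A clone whose ratio is well above 1 after a single round is a candidate binder, which is how deep sequencing can reduce the number of selection rounds needed.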

#### *2.1. Chemical Peptide Libraries*

In the chemical approach, the library is either synthesized on a solid support (an insoluble porous polymeric resin), after which members are cleaved and screened as free compounds, or it is both synthesized and screened on a solid matrix.

Solid-phase peptide synthesis (SPPS) was first achieved by the Nobel laureate Robert B. Merrifield. The general approach of SPPS is to attach the first amino acid to a solid support through its carboxyl group. Subsequently, N-protected amino acids are added one at a time. During coupling, the carboxyl group of the incoming amino acid must be activated, which is commonly achieved by using carbodiimides, amino acid halides, uronium (guanidinium N-oxides), or phosphonium salts [39,40]. After each addition, the N-protecting group must be removed before the next amino acid is added. The most commonly used N-protecting group is the fluorenylmethyloxycarbonyl (Fmoc) group [41], which can be orthogonally removed under basic conditions. Since the growing peptide chain is attached to a surface, waste products of synthesis are removed by simple washing. Combinatorial synthesis is inherently a parallel synthetic process where a single product is obtained in each reaction flask (or in a sealed permeable container ("teabag") submerged in a reaction flask) [42]. Alternatively, a mixture of products can be obtained in a single flask via the "split-and-mix" method (see below) [43]. The whole synthesis process is quantitative, as the reactions are driven to completion by applying reagents in excess at every step. In addition, as the growing peptide chain is attached to a matrix, there is no need for isolation of intermediates. Nevertheless, resources should be used judiciously, as wasting large amounts of reagents is costly and the surplus effluent produced can be an environmental burden [1], although this is a concern mainly when producing large quantities of product and less so for library generation.

For generating large libraries, "split-and-mix" is the preferred strategy. This cyclic SPPS scheme was first demonstrated by Furka et al. [43] in 1991 and involves coupling individual amino acids to resin beads, mixing the beads together, separating them into equal portions, and then reacting each portion with a different amino acid (Figure 1). The mixing, splitting, and coupling steps are repeated until the desired peptide length and diversity are achieved. An important virtue of this method is that a single bead contains a single peptide sequence, which is why libraries produced in this way are termed one-bead-one-compound (OBOC) [44]. The second strategy is called "pre-mix": all the amino acids to be coupled at each synthetic step are premixed in an equimolar ratio and then reacted with a single resin batch. To overcome the problem of the different coupling kinetics of individual amino acids, "smart" mixtures with molar ratios adjusted to the different coupling rates were proposed [45].

**Figure 1.** Synthesis of combinatorial peptide libraries by the "split-and-mix" method. Carrier beads are split into aliquots (only 3 are shown for clarity) for coupling individual amino acid residues, pooled, and the process is iterated until the desired length of peptides is achieved. Library diversity increases exponentially with each coupling step. At the end, each bead carries a single peptide sequence; hence, the name "one-bead-one-compound" combinatorial library. Libraries are typically screened by incubating the beads with a fluorophore-labeled target and subsequent fluorescence-activated sorting.
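The split-and-mix cycle described above lends itself to a short simulation. The sketch below is a toy model under stated assumptions: beads are represented as strings, three building blocks (A, G, L) stand in for the full amino acid set, and coupling is treated as complete in every step. The function names are hypothetical.

```python
import random


def max_diversity(n_building_blocks: int, cycles: int) -> int:
    """Upper bound on the number of distinct sequences after all cycles."""
    return n_building_blocks ** cycles


def split_and_mix(n_beads: int, building_blocks: str, cycles: int,
                  seed: int = 0) -> list[str]:
    """Simulate split-and-mix synthesis; each bead is the sequence it carries."""
    rng = random.Random(seed)
    beads = [""] * n_beads
    k = len(building_blocks)
    for _ in range(cycles):
        rng.shuffle(beads)                           # mix all beads
        portions = [beads[i::k] for i in range(k)]   # split into k portions
        beads = [                                    # couple one residue per portion
            sequence + residue
            for residue, portion in zip(building_blocks, portions)
            for sequence in portion
        ]
    return beads


library = split_and_mix(n_beads=2700, building_blocks="AGL", cycles=3)

print(len(library))        # number of beads is unchanged by the procedure
print(len(set(library)))   # distinct sequences actually present
print(max_diversity(3, 3)) # theoretical maximum, 3 ** 3
```

Every bead ends up with exactly one sequence because it visits only one reaction vessel per cycle, and the theoretical diversity grows as the number of building blocks raised to the number of cycles (20 building blocks and 5 cycles would give 3.2 million sequences).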

Various deconvolution methods have been developed for screening and identification of hits from chemical peptide libraries; most commonly adopted are iterative deconvolution [46] and positional scanning [47]. Iterative deconvolution is based on dividing the library into non-overlapping subsets containing peptides with defined residues at the specified position(s) (while having the rest of the structure randomized). Each subset is then screened separately. The most active subset of compounds is further divided into new subsets (retaining the identified optimal residue(s) from previous screening round) and retested for activity. This process is continued until the fittest molecule is identified [42]. In positional scanning, sub-libraries individually address each diversity position. This position is defined with a single amino acid residue, while the remaining positions are fully randomized. Positional sub-libraries are assayed in parallel to gather information on the optimal residue for each diversity position, consequently identifying the fittest member [47]. Other hit identification techniques include Edman degradation [48] and mass spectrometry (MS)-based strategies [49,50]. Mass spectrometry is highly sensitive and very fast, and is the method of choice when it comes to generating sequence data in the femtomolar range. Edman degradation, although reliable, only becomes a viable option at picomolar quantities [51]. It is also slow and not easily amenable to multiplexing [52]. In contrast, mass spectrometry has the edge in analyzing mixtures of great diversity, making HPLC separation of peptides unnecessary. However, a complete peptide fragmentation pattern is required in order to unambiguously identify each amino acid sequence [50].
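Positional scanning can be illustrated with a toy model. In the sketch below, everything is hypothetical: the three-letter alphabet, the hidden optimum `LAG`, and the `activity` function that stands in for an assay. It also assumes that each position contributes independently to activity, which is the condition under which positional scanning identifies the fittest member.

```python
from itertools import product

ALPHABET = "AGL"      # toy building-block set (assumption)
LENGTH = 3            # number of diversity positions
HIDDEN_BEST = "LAG"   # hypothetical optimal sequence, unknown to the "screen"


def activity(peptide: tuple[str, ...]) -> int:
    """Hypothetical assay: one unit of activity per position matching the optimum."""
    return sum(residue == best for residue, best in zip(peptide, HIDDEN_BEST))


def sublibrary_activity(position: int, residue: str) -> float:
    """Mean activity of the sub-library with one position defined, the rest randomized."""
    members = [
        peptide for peptide in product(ALPHABET, repeat=LENGTH)
        if peptide[position] == residue
    ]
    return sum(activity(member) for member in members) / len(members)


def positional_scan() -> str:
    """Pick, for each position, the residue whose sub-library is most active."""
    best_residues = []
    for position in range(LENGTH):
        scores = {res: sublibrary_activity(position, res) for res in ALPHABET}
        best_residues.append(max(scores, key=scores.get))
    return "".join(best_residues)


print(positional_scan())
```

Only 3 × 3 = 9 sub-libraries are assayed here instead of 27 individual peptides; the saving grows rapidly with the number of building blocks and positions, although the approach can mislead when residues at different positions interact.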

The cutting-edge technology in chemical combinatorial libraries is the so-called DNA-encoded library (DEL) platform. It works by linking each library member, via an adapter module, to a double-stranded DNA barcode, which enables unambiguous identification of retained compounds at the end of the selection process [53]. Both the library synthesis and tagging rely on the "split-and-mix" approach. In each synthesis cycle, the process is encoded (recorded) by ligation of a short DNA tag that identifies the amino acid residue added. A pooled library is then assayed by affinity selection, and selected binders can be easily identified by PCR amplification and sequencing of tags [54,55]. Some peptides identified using the DEL platform include carbonic anhydrase binders [56], ligands of integrins, and CCR6 [57]. Generation of macrocyclic peptide libraries by DEL has also been described [55,58]. DEL offers several benefits over some other established methods, like phage display (see below), for the discovery of peptide ligands. Library sizes (i.e., diversity) are larger, allowing the chemical space to be sampled more deeply in a single screen. Furthermore, DEL platforms are compatible with parallel screening against multiple targets, enable activity-agnostic screening for identification of "silent binders", and consume relatively low amounts of reagents [54]. In addition, not only individual screening hits but also larger families can be detected, in which related building blocks or combinations thereof are enriched [59]. Limitations of the platform include possible steric interference of the oligonucleotide tag with the target–binder interaction. The barcode can also restrict the range of possible chemical reactions and influence the properties of binders. Resynthesis after selection must be performed for subsequent validation of target binding, which adds to the duration of the assay [54,60].
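The encoding principle of a DEL, one DNA tag ligated per synthesis cycle, can be sketched as a simple lookup in both directions. The tag sequences, their fixed length of four nucleotides, and the three-residue alphabet below are hypothetical and chosen only for illustration; real DEL tags are longer and include adapter and primer regions.

```python
# Hypothetical 4-nucleotide tags, one per building block.
TAGS = {"A": "ACGT", "G": "CATG", "L": "GTCA"}
TAG_LENGTH = 4
RESIDUE_FOR_TAG = {tag: residue for residue, tag in TAGS.items()}


def encode(peptide: str) -> str:
    """Build the barcode by appending one tag per coupled residue."""
    return "".join(TAGS[residue] for residue in peptide)


def decode(barcode: str) -> str:
    """Read the barcode tag by tag to recover the peptide sequence."""
    return "".join(
        RESIDUE_FOR_TAG[barcode[i:i + TAG_LENGTH]]
        for i in range(0, len(barcode), TAG_LENGTH)
    )


barcode = encode("LAG")
print(barcode)          # concatenated tags in order of synthesis
print(decode(barcode))  # sequence recovered from the barcode
```

Because the barcode records the synthesis history, sequencing the tags of retained library members is sufficient to identify the binders without analyzing the peptides themselves.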

#### *2.2. Biological Peptide Libraries*

A key feature of biological libraries is the linkage of the genotype (i.e., genetic information) with its corresponding phenotype (the encoded peptide/protein). In directed evolution, (random) mutagenesis (gene diversification) and screening for functional gene products are iteratively alternated, and the phenotype:genotype link is crucial for peptide identification via sequence determination of the encoding oligonucleotide. Based on the display type approach, biological libraries can be categorized as either cellular or acellular [61].

#### 2.2.1. Cellular Approach

The main bottleneck to this approach is a transformation step needed for delivering a DNA library into host cells, providing transcriptional and translational machineries for gene expression. One of the most widely used and recognized methods is phage display, first described in 1985 by George P. Smith [62]. Smith shared half of the 2018 Nobel Prize in chemistry (the other half was awarded to Frances H. Arnold) with Gregory P. Winter for the phage display of peptides and antibodies, validating the gargantuan importance of this technology. Filamentous (bacterio)phages (most commonly used vehicles in phage display) are rod-shaped viruses that infect *E. coli*, and are composed of coat (capsid) proteins that encapsulate the phage genome. Surface display is achieved by inserting a peptide-encoding oligonucleotide sequence into one of the genes for a capsid structural protein [63–65]. All 5 filamentous phage coat proteins have been exploited as anchors for display of foreign peptides and proteins. The filamentous phage's minor coat protein p3 is the most widely used and can present 3–5 copies per virion, seconded by the major coat protein p8 with a much larger count, around 2700 copies. The filamentous phage M13 and the closely related fd are most commonly used due to their ease of manipulation and their ability to accommodate fairly large pieces of foreign DNA. Two main display types emerged for displaying libraries on filamentous phage: the polyvalent (one-gene-system) and the monovalent display (two-gene-system). In the polyvalent display, the DNA fragments are inserted into the phage vector, producing fusions with each copy of the chosen coat protein [65]. As opposed to monovalent display, polyvalent systems yield binders that exhibit reduced affinity due to avidity effects, enriching weak (albeit specific) ligands over the course of a multi-step affinity selection process [66]. 
The monovalent display uses a phagemid vector, essentially a plasmid encoding the foreign peptide-coat protein fusion and harboring the phage origin of replication and the packaging signal for production of single-stranded DNA copies and assembly into phage particles, respectively. A helper phage (which supplies the rest of phage genes, but has assembly defects) is employed to superinfect the cells harboring the phagemid. Thus, the produced virions display a mixture of recombinant coat proteins, encoded by the phagemid, and the cognate wild-type proteins, encoded by the helper phage [27]. Although two-gene-systems are mainly utilized for displaying larger proteins, both systems have been used to display relatively short peptides [67]. Less commonly, lytic phages, such as T7 or lambda, have been used for phage display of peptides [64,68]. Another interesting phage display-like system allowing tunable display valency is based on virus-like particles (VLPs) of the RNA bacteriophage MS2 [69]. In phage display, target-binding peptides are identified through an affinity selection process called panning (Figure 2). This technique comprises several steps. First, a surface-immobilized target is contacted with the phage library, followed by stringent washing to remove unbound and non-specifically bound clones. Specific binders are then eluted, usually by applying pH shock or high salt solutions, and subsequently amplified in host cells. Obtaining binders with high affinity usually requires 3–5 panning rounds. Phage display is the method of choice for many researchers for several reasons. Phage libraries are more affordable than other microbial/cell display vehicles (see below) and can easily be amplified by allowing library phage to replicate in a bacterial host. Virions can withstand various selection environments, endure harsh washing and elution conditions, and can be stored at −80 °C for years. Phage library diversities are typically significantly larger compared to chemical peptide libraries and can reach up to 10<sup>11</sup> clones. Although phage can accommodate a large variety of (poly)peptide structures, amino acid and sequence biases do occur (as with any biological library), leading to inherent library diversity decrease [70]. In filamentous phage, this is due to censorship of charged sequences through the general bacterial secretory (Sec) pathway. Capsid structural proteins fused to charged peptides seem to be less efficiently inserted into the *E. coli* inner membrane, hindering virion assembly [71]. Conversely, lytic phage vehicles have the advantage of not being limited to display of peptides which are efficiently translocated to the periplasm [72]. To display (poly)peptides that fold rapidly in the cytoplasm (and are thus inefficiently transported to periplasm) on *filamentous* phage, the twin-arginine translocase (Tat) export pathway has been exploited instead of the conventional Sec pathway [73,74]. 
In contrast to Sec translocase complex that mediates export of unfolded proteins through the inner bacterial membrane, Tat only exports fully folded proteins. Another drawback of phage display is the non-quantitative nature of clone screening, although this problem can be elegantly tackled by deep sequencing [31] instead of traditional clone picking. In conventional phage display, the displayed peptides are limited to natural L-amino acids. All-D peptide ligands, showing improved metabolic stability, can be developed using the mirror-image phage display strategy [75]. Cyclization of phage-displayed peptides [76] represents another popular method of augmenting protease resistance and simultaneously increasing affinity (also see Section 3). In addition to in vitro or ex vivo (e.g., against isolated cells), phage library selections can also be performed *in vivo*. The latter approach is used to identify tissue homing peptides for cell-specific targeting of therapeutics [77]. In contrast to phage library pannings to purified targets, selections against intact cells and in vivo screens are inherently more complicated as the library is exposed to many potential decoys, leading to high background [63]. When panning against a cell surface receptor, one typically relies on a cell line ectopically overexpressing the targeted membrane protein. This allows for so called *subtractive* or *negative* selections to be performed against the same cells devoid of the target of interest before contacting the library phages with target-transfected cells in each selection round, thereby effectively reducing background binders. In in vivo selections, phage library is typically injected intravenously or directly into tumor xenografts of an experimental animal. Upon systemic application, library phages are likely not distributed uniformly to all tissues, but are rather retained in the circulation, being primarily exposed to vascular endothelium. 
Targeting other tissues requires fairly large phage titers; phages can be rescued from tissue of interest, and amplified for iteration of selection. Deep sequencing enables analysis of the entire repertoire of retained phages [78], and should thereby allow the deduction of specific binders even from single-selection-round experiments [79], especially if independent parallel screens are performed and the same peptides are observed enriched in each case. To limit background phages, vascular perfusion can be harnessed.
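To illustrate how deep sequencing data from independent parallel screens might be combined, the following sketch computes per-peptide enrichment relative to the naive library and keeps only peptides enriched in every screen. The read lists, the pseudo-frequency, and the two-fold threshold are assumptions made for the example and are not taken from the cited studies.

```python
from collections import Counter


def frequencies(reads):
    """Convert a list of sequenced peptide reads into relative frequencies."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def enrichment(naive_reads, selected_reads, pseudo_freq=1e-6):
    """Fold change of each selected peptide relative to the naive library."""
    naive = frequencies(naive_reads)
    selected = frequencies(selected_reads)
    # Peptides absent from the naive sample get a small assumed frequency.
    return {seq: f / naive.get(seq, pseudo_freq) for seq, f in selected.items()}


def consistent_hits(naive_reads, screens, min_fold=2.0):
    """Peptides enriched at least `min_fold` in every parallel screen."""
    hit_sets = []
    for reads in screens:
        folds = enrichment(naive_reads, reads)
        hit_sets.append({seq for seq, fold in folds.items() if fold >= min_fold})
    return set.intersection(*hit_sets)


# Toy data: four peptides equally represented in the naive library.
naive = ["RGD", "NGR", "GSG", "AAA"] * 5
screen_1 = ["RGD"] * 12 + ["NGR"] * 4 + ["GSG"] * 4
screen_2 = ["RGD"] * 10 + ["AAA"] * 10
hits = consistent_hits(naive, [screen_1, screen_2])
print(hits)
```

Requiring agreement between screens is a simple way to discount peptides that are enriched in only one experiment, for instance because of propagation advantages rather than target binding.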

**Figure 2.** Iterative principle of phage display library screening. Recombinant phage DNA is packed into viral particles in vitro (common for T7- and lambda phage-based systems) for transduction of genetic library into host bacteria (**1**). Alternatively, phage DNA can be electroporated into host cells (typical for filamentous phage-based systems; not shown). Exploiting bacterial transcription and translation mechanisms, progeny virions displaying foreign peptides are amplified and assembled (**2**), and released into growth medium (**3**). Amplified library is isolated and purified, and contacted with an immobilized target (**4**). Non-bound clones are removed by stringent washing, while those retained due to target:displayed peptide interaction are eluted and collected for amplification in host bacteria before being subjected to further selection rounds.

Several peptide display systems based on Gram-negative bacteria were reported. Typically, peptides are grafted in the surface-exposed loops of outer membrane proteins. Fusion partners for peptide library display include the *E. coli* OmpA [80], OmpX [81], and the *Pseudomonas* OprF [82] among others. It is also possible to display peptides on extracellular appendages such as pili [83] and flagella, albeit only the latter has been used for library construction [84,85]. Conventional bacterial surface display libraries are commonly screened using FACS. Library display and selection have also been demonstrated in the periplasm of *E. coli*. In the intra periplasm secretion and selection (PERISS) system, the target molecule gene and peptide library DNA are integrated in tandem into a plasmid, expressing the peptide library in the periplasmic space, while the target molecule is incorporated into the *E. coli* inner membrane. The outer membrane is disrupted with ethylenediaminetetraacetic acid (EDTA) and the cell wall is degraded enzymatically to form spheroplasts. Peptides interacting with the target are collected as binding complexes on magnetic beads through a fusion tag, and identified through PCR amplification and sequencing [86]. Another form of periplasmic display is the "anchored periplasmic expression" (APEx) technique, where a peptide library is expressed and immobilized in the periplasm of *E. coli* via fusion to a lipoprotein targeting motif that enables anchoring to the inner membrane. The outer membrane and the cell wall are then removed, followed by incubation with the fluorescently tagged target molecule, washing, and binder detection by FACS [87,88]. Furthermore, peptide libraries were expressed in the cytoplasm of bacteria via fusions to the C-terminus of the *lac* repressor. The repressor protein binds to *lac* operator sequence on the plasmid encoding the peptide, providing a physical genotype-phenotype link. 
The library is screened by affinity purification with an immobilized receptor [89]. Screening techniques of the latter are inferior compared to surface display, as the cells must be lysed prior to panning to the immobilized target. Another important technique, the bacterial two-hybrid system, is used for studying protein–protein interaction (PPI). It can be used to screen libraries of peptides to probe and manipulate biological pathways [90]. The premise of all variants of two-hybrid platforms is the identification of target binders via in vivo reconstitution of a reporter's activity that depends on the interaction between a pair of mediator proteins. Typically, the target protein ("bait") is fused to the transcriptional activator's DNA-binding domain (interacting specifically with the upstream activating sequence of a reporter gene), while the library (poly)peptide variants ("prey") are fused to the transcriptional activation domain (which recruits the RNA polymerase). Another option is the functional complementation of a split two-domain enzyme adenylate cyclase upon bait-prey interaction, leading to indirect reporter gene activation [91]. Interactions between protein pairs are identified by growing library bacteria either on color indicator or selective media plates [92]. In general, assets of bacterial combinatorial libraries are fast bacterial growth rate, ease of genetic and physical manipulation, and (in case of surface display and periplasmic expression) highly efficient screening protocols based on FACS. On the other hand, display platforms may suffer from interference of the complex bacterial surface with the selective peptide:target recognition. A recently developed peptide library system termed "surface localized antimicrobial display" (SLAY) [93] relies on a "reversed" screening protocol to identify peptides with antimicrobial activity. In SLAY, library plasmids are transformed into bacteria of interest and induced to express the encoded peptides. 
The peptides with antibacterial properties will lead to bactericidal or bacteriostatic effects, eliminating these clones from the population. Using NGS and in silico techniques, sequences pre- and post-induction are analyzed to identify members with antimicrobial properties.
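The "reversed" readout of SLAY can be sketched as a depletion analysis of sequencing counts. The peptide sequences, read counts, pseudocount, and threshold below are invented for illustration; the published analysis pipeline may differ.

```python
import math


def log2_fold_change(pre, post, pseudocount=1.0):
    """log2 ratio of post- to pre-induction frequency for each peptide."""
    pre_total = sum(pre.values())
    post_total = sum(post.values())
    changes = {}
    for peptide, pre_count in pre.items():
        f_pre = (pre_count + pseudocount) / pre_total
        # Clones that vanish after induction still get a finite value.
        f_post = (post.get(peptide, 0) + pseudocount) / post_total
        changes[peptide] = math.log2(f_post / f_pre)
    return changes


def depleted(pre, post, threshold=-1.0):
    """Peptides whose clones drop out after induction (candidate antimicrobials)."""
    changes = log2_fold_change(pre, post)
    return sorted(p for p, lfc in changes.items() if lfc <= threshold)


# Hypothetical read counts before and after induction of peptide expression.
pre_counts = {"KWKLFKKI": 100, "GSGSGSGS": 100, "AAAAAAAA": 100}
post_counts = {"KWKLFKKI": 5, "GSGSGSGS": 150, "AAAAAAAA": 145}
candidates = depleted(pre_counts, post_counts)
print(candidates)
```

In contrast to affinity panning, the sequences of interest here are those that become less frequent, since cells expressing an active peptide are removed from the population.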

Phage and bacterial display can be classified as prokaryotic display. There are also vast possibilities of displaying peptides and proteins in eukaryotic systems. Their main advantage is the reliable folding and glycosylation of eukaryotic proteins [94]. One of the most widely used organisms in this category is *Saccharomyces cerevisiae*. Yeast has been utilized as a vessel for numerous types of peptide library screening. Like the bacterial two-hybrid system, the yeast two-hybrid system is used for investigating protein–protein interaction through the activation of reporter genes responding to a reconstituted transcription factor [95,96]. This particular method gave rise to a number of adaptations to study PPIs, such as the *ras* recruitment system [97], split ubiquitin system [98], and the yeast three-hybrid system [99], to name a few. Conversely, yeast surface display is achieved by fusions to a cell surface-anchored protein, followed by screening and selection through magnetic separation or FACS [100]. Foreign peptides are commonly fused to the C-terminus of Aga2p, a subunit of a-agglutinin, the surface protein that mediates cell-to-cell adhesion during yeast mating. Aga2p is linked through two disulfide bridges to the Aga1 protein, which is covalently bound to cell wall glucan, resulting in a covalent complex on the surface of the yeast cell [101]. Other surface anchor proteins used in this manner are Agα1p, Cwp1p, Cwp2p, Tip1p, Flo1p, Sed1p, YCR89w, and Tir1, a choice that depends on the type of protein/peptide to be displayed [102,103]. Another type of display in yeast is possible via secretory expression [104]. A number of novel ligands have been discovered via screening of peptide or small protein libraries expressed in yeast, both on the surface and intracellularly. 
Examples of surface display libraries include cysteine knot peptides (knottins) [105] and lanthipeptides [106], whereas head-to-tail cyclized peptide libraries [96,107] (see the SICLOPPS method description below) are expressed intracellularly. The main advantage of yeast display is its eukaryotic protein expression mechanism, which allows for complex post-translational modifications, and quantitative library screening through FACS [108,109]. Disadvantages include smaller library sizes due to low transformation efficiency [110], and lower affinity caused by unintended multivalent binding to oligomeric targets, although this can be surmounted by applying kinetic selections [111]. Yeast also form high-mannose type glycans that render glycoproteins produced in this system unfit for human applications, but this problem can be tackled by humanizing yeast glycosylation pathways [112].

Another platform well-suited for (poly)peptide display is the baculoviral particle (or baculovirus infected cell) system. Baculoviruses, enveloped viruses infecting invertebrates, are distinguished by a large packaging capacity, and (being eukaryotic pathogens) support diverse post-translational modifications [113]. Typically, foreign peptides or proteins are fused with the major envelope glycoprotein, such as the gp64 of the baculovirus *Autographa californica* multiple nuclear polyhedrosis virus (AcMNPV), and the recombinant fusions are embedded in the viral envelope (and the plasma membrane of infected insect cells) along with the wild-type glycoprotein. Other membrane anchoring strategies can be exploited to display a wide range of proteins and peptides over a range of valencies [114,115]. Regardless of the display strategy, libraries on infected insect cells are screened with FACS [116], and libraries displayed on the AcMNPV virions may be selected via conventional affinity panning [117]. In a prominent example, baculovirus-displayed libraries of peptides bound to the class I or II major histocompatibility complex (MHC) have served for identification of T cell receptor peptide antigen mimetics (mimotopes) [118,119]. Because it combines benefits characteristic of prokaryotic platforms (e.g., simple viral propagation as insect cells do not require CO<sub>2</sub> exchange for growth, high transfection rates, and the low-risk biosafety profile [114,120]) with clear eukaryotic advantages (efficient protein folding and a plethora of accessible post-translational modifications), the baculoviral display platform seems underappreciated. On the other hand, downsides of using this technology are expensive growth media [121], glycosylation patterns that are similar to but still distinct from those of mammalian systems [122], and time-consuming cloning in the baculoviral vector, although new systems to avert this lengthy step have been developed (reviewed by Possee and King [123]).

Peptide libraries have also been displayed on several eukaryotic RNA viruses by inserting short peptides into their native envelope proteins without interfering with viral infectivity. Examples of eukaryotic retroviral vehicles include the avian leukosis virus [124] and feline leukemia virus [125,126]. Alongside retroviruses, the adeno-associated virus (AAV) has been used as a platform for peptide library display. Here, capsid diversification combined with phenotypic screening was primarily used as a means of achieving re-targeted tropism [34,127–131]. Normally, AAV libraries are produced in human cells, such as HEK 293T or HeLa [132]. AAV library construction starts by inserting encoded capsid gene variants (in place of previously excised wild-type *cap* gene sequence) in a template vector encoding a complete AAV genome. Because the natural vector tropism is disrupted, a rare subpopulation of library peptides is expected to redirect virions to new, formerly inaccessible cell types [133]. Library screening involves passaging the virions over several rounds in the chosen cell type to enrich for capsid variants with higher transduction efficiency and/or specificity compared to the naïve library background [134]. AAV libraries are primarily considered for their modus operandi: they infect humans and can thus be exploited as target-specific gene therapy vectors. AAV does not seem to cause any diseases and is only weakly immunogenic. The AAV library platform faces many challenges, the most notable being the uncertainty of capsid-genome correlation due to the possibility of co-transfection with different AAV capsid variants, resulting in library members with multiple phenotypes [132].

Peptide libraries have also been displayed on the surface of mammalian cells. Examples include fusions to the chemokine receptor CCR5 for antibody mimotope selection [135] and cystine-dense peptides for difficult to drug targets [36]. On the other hand, peptide libraries have been deployed for intracellular screens in mammalian cells [136–139]. Besides the benefit of authentic post-translational modifications, the most conspicuous advantage of mammalian display libraries is the ability to screen against targets and to investigate PPIs in their native environment. On the other hand, obvious downsides of working with mammalian cells are high cultivation costs and laborious cultivation technologies.

The "split-intein circular ligation of peptides and proteins" (SICLOPPS) is a method for cyclic peptide library generation. It takes advantage of intein splicing to generate peptide libraries in cells [140]. Inteins are self-excising protein domains that process to link their consecutive sequences with a peptide bond, while liberating the intervening portions (exteins) as head-to-tail cyclized peptides [141]. Library peptides to be cyclized are usually randomized exteins six amino acids long, flanked by C- and N-terminal intein domains with splice site-adjacent cysteine residues. Propagation of plasmids harboring SICLOPPS expression cassettes in cells resolves the problem of genotype:phenotype linkage. As the libraries are assembled in cells, this approach is particularly well-suited for functional assays; both function and affinity (to a large repertoire of accessible intracellular targets) for each binder can be assessed. Its simplicity, speed, and applicability to different organisms [107,142,143] make this high-throughput approach a platform of choice. On the other hand, the main bottleneck of SICLOPPS is the limited library size of 10<sup>7</sup>–10<sup>9</sup> (determined by transfection efficiency) [140,144,145].

#### 2.2.2. Acellular Approach

In contrast to the cellular approach, the acellular libraries are not propagated in living cells. These systems are based on in vitro transcription and translation, and one of the main advantages of this approach is the potential to generate high diversity libraries with up to 10<sup>15</sup> individual entities by avoiding the cell transformation step. Also convenient is the ability to apply stringent screening conditions that would be incompatible with cell viability.

In ribosome display, peptide:ribosome:mRNA complexes are generated, physically associating nascent peptide to their encoding mRNA. This is achieved by eliminating stop codons, consequently stalling both the nascent peptide and its encoding mRNA on the ribosome. The method consists of the following steps: DNA library preparation, in vitro transcription and translation, affinity selection and mRNA recovery, followed by reverse transcription and PCR amplification for sequencing or further selection rounds [146]. Many types of peptide libraries have been screened through this platform, including tumor targeting peptides [147], peptides that bind to monoclonal antibodies [148,149], streptavidin ligands [150], and metal binding peptides [151], to name a few. Besides its simplicity and the obvious ability to create high diversity libraries, another key advantage of the technology is that the diversity of the libraries can be easily manipulated by introducing new mutations at any selection step, and is therefore particularly suited for directed evolution projects [152]. With the advances such as PURE (protein synthesis using recombinant elements), ribosome display has also become very stable, as it is composed only of reconstituted ribosome and translation factors, eliminating degradation of mRNA and nascent (poly)peptides by endonucleases and proteases from the cell extract [153]. Another detriment that is being resolved is the thermal stability of this platform; usually, ribosome display is performed at near freezing temperatures. Fusing the library peptides to the Cv RNA-associating protein (Cvap) and adding the Cv RNA motif to the 5' mRNA end renders the peptide-Cvap:mRNA:ribosome complexes stable at room temperature, even for prolonged time [151,154]. However, the fragile noncovalent phenotype:genotype conjugation still requires mild selection conditions.

Another technique, mRNA display, is similar to ribosome display. The essence of this technology is the employment of puromycin, an antimicrobial product that inhibits translation by mimicking the 3'-end of an aminoacyl-tRNA, conjugated to the 3' terminus of mRNA [155]. When the ribosome reaches the 3' end of the template, the newly synthesized peptide is transferred onto the puromycin, achieving genotype–phenotype linkage with its encoding mRNA. Affinity panning is then the preferred practice for binder identification. As in ribosome display, after each selection round, mRNA is reverse transcribed to cDNA and PCR amplified for the purpose of sequencing or further selection rounds. This platform has been extensively used and there are numerous examples of discovered novel peptide binders. These include cyclic peptide therapeutics targeting GPCR signaling [12], streptavidin binding peptides [156], and peptide vaccines [157]. Compared to ribosome display, the covalent peptide–mRNA linkage tolerates more stringent selection conditions [158]. A major advantage over ribosome display is the absence of the ribosome, a huge 2,000,000 Da ribonucleoprotein complex, which can interfere with the selection process [159]. Despite its unique ability in addressing a repertoire of different biological problems, mRNA display has limitations like any other display technology. One of the concerns involves interactions of covalently bound mRNA with the displayed peptide or the target molecule. The interference of flexible mRNA is particularly problematic when dealing with proteins that nonspecifically bind nucleic acids [158]. Furthermore, the highly negatively charged mRNA fusion moiety may interfere with positively charged target molecules [160].

A major problem in both ribosome and mRNA display is mRNA lability. This was the driving motivation behind the development of the cDNA display method. Here, mRNA is ligated to a looped DNA linker, harboring a primer region for reverse transcription and a restrictase cleavage site, and carrying a loop-attached biotin group and a 3' terminal puromycin moiety. Thus, the translated mRNA-encoded peptide is transferred to the DNA linker, and the complexes are captured on streptavidin beads for reverse transcription. Finally, the library is released from the solid support by restrictase treatment [161]. This refined platform has been used in screening for cysteine-rich peptides with antagonistic activity at interleukin-6-receptor [162], peptide antagonists of growth hormone secretagogue receptor [163], amino group binders [164], binders of vacuolating toxin [165], and peptide agonist and antagonist of angiotensin II type-1 receptor [166]. The problem of RNA instability instigated the design of another DNA-based approach called CIS display. This method harnesses the so-called cis activity—the capacity of a DNA replication initiator protein (RepA), fused to the library peptides, to bind exclusively to the cognate template DNA [167]. CIS display has been applied for identification of binders of antibodies [167], protease-resistant peptides [168], and ligands of human vascular endothelial growth factor receptor [169].

Another acellular platform dubbed "in vitro compartmentalization" (IVC) mimics natural encapsulation of living cells by entrapping a DNA library and a transcription/translation reaction mixture in water-in-oil emulsion droplets. These "microreactors" serve as boundary analogs to a cell membrane, assuring an effective genotype–phenotype linkage; on average, each droplet contains a maximum of one library member. The in vitro transcription/translation machinery within the droplets processes the genes, and the desired phenotype is selected using a suitable strategy [170]. For example, the DNA can be linked to a substrate which is converted into a marker product by the encoded enzyme, rendering it detectable by FACS [171]. It has been reported that 1 mL of such an emulsion can hold as many as 10<sup>10</sup> droplets [170]. Adaptations of this technique (e.g., STABLE [172], SNAP [173]) have been summarized in a recent review [174]. Although IVC is a technology especially handy for in vitro enzyme evolution [175], it has also been applied for screening peptide libraries [176,177]. In addition to sharing the benefits of the above acellular platforms, IVC allows reactions inside the droplets to be controlled by adding reaction constituents in a step-wise fashion at defined time-points (e.g., after in vitro translation) [178]. IVC is also suitable for quantitative screening using FACS [179]. There are technical limitations to this platform, and the one that stands out is the formation of double emulsion droplets with the consequence of severing the genotype–phenotype link, although this could be solved by employing a high-throughput screening platform using droplet-based microfluidics [180]. IVC libraries are also considerably smaller than mRNA-display libraries [181].
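Why droplets are loaded so that they hold at most about one gene can be seen from a simple occupancy estimate. The sketch below assumes that genes distribute over droplets according to a Poisson distribution with mean `lam` genes per droplet; this statistical model is an assumption made for illustration and is not stated in the cited work.

```python
import math


def occupancy(lam):
    """Poisson probabilities of a droplet holding 0, 1, or more than 1 gene."""
    p_empty = math.exp(-lam)
    p_single = lam * p_empty
    return {"empty": p_empty, "single": p_single, "multiple": 1 - p_empty - p_single}


def mixed_fraction(lam):
    """Fraction of occupied droplets in which the genotype-phenotype link is ambiguous."""
    occ = occupancy(lam)
    return occ["multiple"] / (1 - occ["empty"])


for lam in (0.1, 0.3, 1.0):
    print(f"mean genes/droplet = {lam}: "
          f"{mixed_fraction(lam):.1%} of occupied droplets hold more than one gene")
```

Under this assumption, diluting the library lowers the share of ambiguous droplets at the price of leaving most droplets empty, which is one reason usable IVC library sizes stay below the raw droplet count.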

#### **3. Incorporating Unnatural Amino Acids and Constraints into (Library) Peptides**

The methods described above have become invaluable for the discovery or refinement of peptide binders. Still, low-molecular-weight leads are generally preferred over peptides in drug discovery, owing to peptides' inadequate properties. For example, libraries of natural peptides are limited to the 20 proteinogenic amino acids and are normally restricted to linear peptides that have high flexibility and poor pharmacokinetic properties. The natural amino acid limit makes it difficult (if not impossible) to introduce predefined pharmacophores into peptide libraries. Moreover, the exploration of "chemical space" is limited by the restricted set of residues, many of which share the same or similar side-chain chemical groups. Marked flexibility of the peptide backbone imposes high entropic cost upon peptide ligand binding to the target molecule, resulting in low affinity interactions [182,183]. Linear peptides can also fall prey to enzymatic cleavage by exoproteases [184]. Potent and proteolytically stable binders can be designed by introducing modifications into peptides, such as incorporation of non-proteinogenic residues [185], cyclization [186,187], or the inclusion of stabilizing moieties (e.g., the parallel β-sheet scaffolds) [188]. Another option is the introduction of chemical post-translational modifications (cPTM) that can vastly expand the diversity of peptide libraries, for example, by phosphorylation [189], conjugation to glycans [190,191], attachment of a fluorescent reporter [192], or ligation to a synthetic peptide fragment [193]. By introducing constrained topologies, cyclic [194] or bicyclic [195] variants can be generated [187]. However, cPTMs are not quantitative. These moieties are also not genetically encoded, and thus cannot be identified via sequencing, although this problem can be overcome with the construction of chemically identical peptide libraries using different codons ("silent barcodes"), signifying a specific cPTM [196].
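One simple way to picture a "silent barcode" is a short constant region whose synonymous codons record which modification a sub-library received. The Gly-Ser linker, the codon choices, and the modification labels below are hypothetical and serve only to illustrate the idea; they are not the design used in the cited study.

```python
# Minimal codon table covering only the codons used in this toy example.
CODONS = {"GGT": "G", "GGC": "G", "TCT": "S", "AGC": "S"}

# Hypothetical assignment: synonymous Gly-Ser linkers mark the chemical treatment.
SILENT_BARCODES = {
    "GGTTCT": "unmodified",
    "GGCAGC": "glycan conjugate",
}
BARCODE_LEN = 6


def translate(dna):
    """Translate a DNA string using the toy codon table."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


def read_modification(read):
    """Split a sequencing read into its cPTM label and the remaining library DNA."""
    barcode, library_dna = read[:BARCODE_LEN], read[BARCODE_LEN:]
    return SILENT_BARCODES[barcode], library_dna


# Both barcodes encode the same dipeptide, so the displayed peptides are identical.
print(translate("GGTTCT"), translate("GGCAGC"))
print(read_modification("GGCAGC" + "TGGAAA"))
```

The peptide sequence is unchanged by the barcode, yet sequencing still reveals which chemically treated sub-library a selected clone came from.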

Genetic code "reprogramming" refers to reassignment of arbitrary codons from proteinogenic to non-proteinogenic amino acids, allowing ribosomal synthesis of non-canonical peptides. This strategy is compatible with acellular in vitro display techniques. In a reconstituted translation system such as PURE [153], selected aminoacyl-tRNA synthetases (ARSs) can be omitted, which frees the corresponding codons for reassignment and thereby increases library diversity. tRNAs can be charged with nonstandard or artificial amino acids employing natural or modified ARSs [197]. Promiscuous tRNA acylation ribozymes, termed "flexizymes", were created as a surrogate for ARSs. Flexizymes recognize activated carboxylates and allow extensive genetic code reprogramming [198]. Combining flexizymes with a custom in vitro translation system, a new technology, dubbed FIT (flexible in vitro translation), was developed. Combining FIT with mRNA display gave rise to the so-called RaPID (random non-standard peptide integrated discovery) system, facilitating the discovery of potent nonstandard peptides for therapeutic and diagnostic use [38].

#### **4. Peptide Library Design and Construction**

*Completely* randomizing even relatively short peptides would require a library size surpassing the capacities of most platforms. Sampling the complete mutational space for peptides exceeding 8–9 residues is therefore practically impossible, and gene diversification strategies only allow for generation and subsequent interrogation of a limited subset of the entire theoretical peptide population. Peptide maturation can be depicted as an ascent in a simplified fitness landscape (Figure 3) in which the x-y coordinates denote the otherwise multidimensional genotype, and the z-axis represents the peptide's "phenotypic" traits, e.g., target affinity. Ascending towards peak activity with mutational steps is the goal of directed evolution. Beneficial mutations accumulate over several generations under selection pressure, resulting in improved phenotype [199].
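The size argument can be made concrete with a short calculation. The sketch below compares the number of possible peptides of a given length (20 proteinogenic residues per position) with the upper library sizes quoted earlier in this review (about 10<sup>11</sup> clones for phage display and about 10<sup>15</sup> for acellular systems). Treating one clone per sequence as "full coverage" is a simplifying assumption; real libraries need considerable oversampling.

```python
# Upper library sizes as quoted in the text (orders of magnitude only).
CAPACITY = {
    "phage display": 10**11,
    "acellular display": 10**15,
}


def sequence_space(length, alphabet=20):
    """Number of distinct peptides of `length` residues."""
    return alphabet ** length


def max_fully_covered_length(capacity, alphabet=20):
    """Longest peptide whose complete sequence space fits within `capacity`."""
    length = 0
    while sequence_space(length + 1, alphabet) <= capacity:
        length += 1
    return length


for platform, capacity in CAPACITY.items():
    n = max_fully_covered_length(capacity)
    print(f"{platform}: full randomization feasible up to {n} residues "
          f"({sequence_space(n):.2e} sequences)")
```

Each additional randomized position multiplies the sequence space by 20, so even a thousand-fold larger library extends full coverage by only two or three residues.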

In general, library generation can be performed either through focused or random mutagenesis. The latter is usually used in the absence of structure–function relationship knowledge. In focused mutagenesis, residues previously found to be essential for peptide activity are retained (or favored over the rest of the building block set), while the others are (fully or partially) randomized. Of course, the odds that a library contains improved peptide variants are higher for those produced by focused mutagenesis. A plethora of mutagenesis methods can be used for gene diversification in library generation and we will briefly discuss them below.

**Figure 3.** Maturation of a peptide depicted as ascent on a simplified fitness landscape. After each selection round, mutations are introduced into the enriched combinatorial library, and the next generation of peptides is screened for improved affinity and/or activity.

#### *4.1. Random Mutagenesis*

Random mutagenesis based on physical and/or chemical mutagens is sufficient for traditional genome screening (gene inactivation), but it is not suitable for directed evolution due to its limited mutational spectrum [200,201]. For library generation purposes, random mutagenesis can be performed in vivo in bacterial mutator strains that contain defective proofreading and repair enzymes (mutS, mutT, and mutD) [202–204]. Another approach in *E. coli* relies on mutagenesis plasmids (MP), which carry multiple genes for proteins affecting DNA proofreading, mismatch repair, translesion synthesis, base selection, and base excision repair, thereby enabling broad mutagenic spectra. MPs support mutation rates 322,000-fold over basal levels and are suitable for platforms based on bacterial and phage-mediated directed evolution [205]. Unfortunately, besides mutating the library gene, mutator strains and MPs also induce deleterious mutations in the host genome. In eukaryotes, this was overcome by the development of an orthogonal in vivo DNA replication apparatus, which in essence utilizes plasmid–polymerase pairs, limiting mutagenesis to a cytoplasm-only event [206]. Related phenomena are also known to occur in nature (e.g., the *Bordetella bronchiseptica* bacteriophage error-prone retroelement, which selectively introduces mutations into the gene encoding the major tropism determinant (Mtd) protein on the phage tail fibers [207]) and can be exploited for creating libraries [208].

One of the most established methods for in vitro random mutagenesis is error-prone PCR (epPCR), first described in 1989 [209]. It works by harnessing the natural error rate of low-fidelity DNA polymerases, generating point mutations during PCR amplification. However, even the faulty *Taq* DNA polymerase is not erroneous enough to be useful for constructing combinatorial libraries under standard amplification conditions. The fidelity of the reaction can be further reduced by altering the amounts of the divalent cations Mn<sup>2+</sup> and Mg<sup>2+</sup>, introducing biased concentrations of deoxyribonucleoside triphosphates (dNTPs) [210], using mutagenic dNTP analogues [211], or adjusting elongation time and the number of cycles [212]. Random mutations can also be induced by utilizing 3'-5' proofreading-deficient polymerases [213–215].

Despite its popularity, epPCR suffers from limited mutational spectrum as it inclines to transitions (A↔G or T↔C). Thus, epPCR-generated libraries are abundant in synonymous and conservative nonsynonymous mutations as a result of codon redundancy [199]. Ideally, all four transitions (AT→GC and GC→AT) and eight transversions (AT→TA, AT→CG, GC→CG, and GC→TA) would occur at equal ratios, with the desired probability, and without insertions or deletions [216]. This problem has been addressed by the sequence saturation mutagenesis (SeSaM) [217] method, which utilizes deoxyinosine, a promiscuous base-pairing nucleotide that is enzymatically inserted throughout the target gene and later changed for canonical nucleotides using standard PCR amplification of the mutated template gene. SeSaM was later improved with the introduction of SeSaM-Tv-II [218], which generates sequence space unobtainable via conventional epPCR by increasing the number of transversions. It employs a novel polymerase with increased processivity, allowing efficient read through consecutive base-pair mismatches. EpPCR has been successfully adopted for library generation in various platforms, including phage [219], *E. coli* [220], and ribosome [221] display.
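The counts of four transitions and eight transversions follow directly from the purine/pyrimidine classification of the bases. The sketch below is a minimal illustration, assuming single-strand, directed substitutions (e.g., A→G and G→A counted separately). The function names are hypothetical.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError("ref and alt must be two different bases from A, C, G, T")
    # Transition: purine <-> purine or pyrimidine <-> pyrimidine
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def count_substitution_types() -> dict:
    """Count both classes over all 12 directed single-base substitutions."""
    counts = {"transition": 0, "transversion": 0}
    for ref, alt in permutations("ACGT", 2):
        counts[classify_substitution(ref, alt)] += 1
    return counts


if __name__ == "__main__":
    print(count_substitution_types())
```

Because each base has one transition partner but two transversion partners, an unbiased mutagenesis method would be expected to yield twice as many transversions as transitions; a library dominated by transitions therefore samples only a fraction of the accessible codon changes.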

Alternatively, mutagenesis can be achieved by performing isothermal rolling circle amplification (RCA) under error-prone conditions. Using a wild-type sequence as a template, this method is able to generate a random DNA mutant library, which can be directly transformed into *E. coli* without subcloning [222]. RCA was advanced further, coupling it with Kunkel mutagenesis [223] (see below). Termed "selective RCA" (sRCA), it operates by producing plasmids in *ung-* (uracil-DNA-glycosylase deficient) *dut-* (dUTP diphosphatase deficient) *E. coli* strain to introduce non-specific uridylation (dT→dU). After PCR with mutagenic primers, abasic sites are created by the uracil-DNA glycosylase in the uracil-containing template. Only mutagenized products are amplified by RCA, excluding non-mutated background sequences [224].

Although epPCR generates high mutational rates, the sequence space remains mostly untapped [225]. DNA shuffling is touted to be superior to epPCR and oligonucleotide-directed mutagenesis because it does not suffer from the possibility of introducing neutral or non-essential mutations from repeated rounds of mutagenesis [226]. DNA shuffling was the first in vitro recombination method and it involves random fragmentation of a pool of closely related dsDNA sequences and subsequent reassembly of fragments by PCR [227]. Such template switching generates a myriad of new sequences and improves library diversity by mimicking natural sexual recombination [228]. Meyer et al. [225] developed an approach where DNase I creates double-stranded breaks at the regions of interest, followed by denaturation and reannealing at homologous regions. Hybridized fragments then serve as templates and are subjected to repeated PCR rounds to form a whole array of new sequences. Improved methods were developed, eliminating the lengthy DNA fragmentation step. In the "staggered extension process" (StEP), polynucleotide sequences can be diversified through severely-abbreviated annealing/polymerase-catalyzed extension. In each cycle, growing fragments switch between different templates and anneal to them based on sequence complementarity. They then extend further and the cycle is repeated until full-length mosaic sequences are formed [229]. Another ingenious method for creating random customized peptide libraries by Fujishima et al. [230] works by shuffling short DNA blocks with dinucleotide overhangs, enabling efficient and seamless library assembly through a simple ligation process.

Currently, recombination methods are shifting from in vitro to *in vivo*. Taking advantage of the high occurrence of homologous DNA recombination events in *S. cerevisiae*, the "mutagenic organized recombination process by homologous in vivo grouping" (MORPHING) method was developed. MORPHING is a "one-pot" random mutagenesis method allowing construction of libraries with various degrees of diversity. Short DNA segments are produced by epPCR, and subsequently assembled with conserved overlapping gene fragments and the linearized plasmid by in vivo recombination upon transformation into yeast cells [231]. Another technique for assembling linear DNA fragments with homologous ends in *E. coli* is called "in vivo assembly" (IVA). IVA uses PCR amplification with primers designed to substitute, delete, or insert portions of DNA, and to simultaneously append homologous sequences at amplicon ends. Finally, it exploits recA-independent homologous recombination *in vivo*, greatly simplifying complex cloning operations. Thus, multiple simultaneous modifications (insertions, deletions, point mutations, and/or site-saturation mutagenesis) are confined to a single PCR reaction, and multi-fragment assembly (library construction) proceeds in bacteria following transformation [232].

#### *4.2. Focused Mutagenesis*

Effectively exploring the sequence landscape requires structural and biochemical data (from previous random mutagenesis studies), which can be leveraged to constrain genetic variation to distinct positions of the (poly)peptide. Such positions include regions of the peptide aptamer scaffold that can endure substitutions/insertions/deletions without affecting the overall protein fold, or those peptide residues considered not absolutely essential for the specific property of interest (and whose mutation might further augment the peptide's activity). Random mutagenesis results in stochastic point mutations at codons corresponding to such residues, but systematically interrogating the entire set of residues at a specific position requires a focused mutagenesis strategy. Focused libraries are typically smaller and more effective, as they only address the residues presumed to bestow the peptide with the property of interest [233].

#### 4.2.1. Enzyme-Based Approaches

Building a library of recombinant DNA constructs is a widely adopted practice accessible to virtually all laboratories, due to the ease of oligonucleotide synthesis and availability of commercial restriction enzymes and DNA ligases. The so-called oligonucleotide-directed mutagenesis enables point or multiple mutations to the target DNA sequence [234]. Normally, a mutagenic primer is designed and synthesized, subsequently elongated by Klenow fragment of DNA polymerase I, ligated into a vector by T4 DNA ligase and finally transformed into a competent *E. coli* strain. This process is long and includes multiple subcloning and ssDNA rescuing steps [235]. Several kits for site-specific mutagenesis based on mutagenic primers are commercially available. One of the systems works by applying a pair of forward and reverse complementary oligonucleotides with designed mutations. The primers are perfectly complementary to the template at 5' and 3' ends, but carry a changed central nucleotide sequence. A high-fidelity *Pfu* DNA polymerase is used to amplify the entire plasmid harboring the gene to be mutated, followed by the removal of the template by *Dpn*I (an endonuclease specific for methylated DNA) [236]. There are numerous adaptations of this method (reviewed by Tee and Wong [216]).

An approach termed "Kunkel mutagenesis" is commonly used for constructing libraries displayed on filamentous phage [237,238], because its genome is circular and single-stranded. In Kunkel mutagenesis, mutations are introduced with a mutagenic primer that is complementary to the circular ssDNA template. The template is propagated in an *ung- dut- E. coli* strain. This enzyme handicap results in the template DNA containing uracil bases in place of thymine. The template is recovered and hybridized with the primer and extended by polymerase, followed by transformation into *ung*<sup>+</sup> *dut*<sup>+</sup> host cells [223]. Upon transformation, uridylated DNA template is biologically inactivated through the action of uracil glycosylase [239] of the *ung*<sup>+</sup> *dut*<sup>+</sup> host, granting a strong selection advantage to the mutated strand(s) over the template.

Overlap extension PCR is another focused mutagenesis approach. First, two DNA fragments with homologous ends (and harboring desired mutation(s)) are amplified in separate PCR reactions by using 5' complementary oligonucleotides. In a subsequent reaction, the fragments are combined; now, the overlapping 3' ends from one of the strands of each fragment anneal and serve as "mega" primers for extension of the complementary strands. Finally, the construct is amplified with the two flanking primers [240]. Based on this strategy, the SLIM (site-directed ligase-independent mutagenesis) method, compatible with all three types of sequence modifications (insertion, deletion, and substitution), employs an inverse PCR amplification of the plasmid-embedded template by two 5' adapter-tailed long forward and reverse primers (which include modifications) and two short forward and reverse primers (identical to the long ones but lacking the 5' adapter sequences) in a single reaction, producing 4 distinct amplicons. Next, the amplicons are heat denatured and reannealed to yield 16 (hetero)duplexes, 4 of which are directly cloneable, forming circular DNA through a ligation-independent pathway via complementary 5' and 3' single-stranded overhangs. All steps of the SLIM procedure are carried out in a single tube [241].

Gibson assembly is a method of combining up to 15 DNA fragments containing 20–40 bp overlaps in a single isothermal reaction. It utilizes a cocktail of three enzymes: exonuclease, DNA polymerase, and DNA ligase. The exonuclease nibbles back DNA from the 5' end, enabling annealing of homologous DNA fragments. DNA polymerase then fills in the gaps, followed by covalent joining of the fragments by the DNA ligase [242]. Applications of Gibson assembly include site-directed mutagenesis and library construction [243]. A recent adaptation, QuickLib, is a modified Gibson assembly method that has been used to generate a cyclic peptide library [244]. QuickLib uses two primers that share complementary 5' ends, one long and partially degenerate and the other short and non-degenerate, which are then used for full plasmid PCR amplification. Subsequently, a Gibson reaction is performed which circularizes the library of linear plasmids, followed by template elimination by *Dpn*I restriction.

Besides conventional enzymes involved in cumbersome digestion and ligation steps, other enzymes can be utilized for mutagenesis. In nature, lambda exonuclease aids viral DNA recombination. It progressively degrades the 5'-phosphoryl strand of a duplex DNA from 5' to 3', producing ssDNA and (mono)nucleotides [245]. To exploit this property, first, a PCR amplification using template ssDNA and phosphorylated primers with overlapping regions is performed. The PCR product is then treated with lambda exonuclease, generating ssDNA fragments that are subsequently annealed via overlap regions. Afterwards, Klenow fragment is employed to create dsDNA. In this manner, site-specific mutagenesis can be performed using primers that contain degenerate bases [246].

One of the most broadly used approaches for characterizing the contribution of individual amino acid residues of a (poly)peptide to binding affinity or activity is alanine-scanning mutagenesis. As the name implies, the technique is based on systematic substitution of residues with alanine, followed by assessment of the ligand's activity in a biochemical assay. Alanine eliminates the influence of all side chain atoms beyond the beta-carbon, thus exploring the role of side chain functional groups at interrogated positions [247]. For example, a conventional single-site alanine-scanning was used to assess the contribution of individual amino acid residues of an Fc fragment binding peptide displayed on filamentous phage [67]. Since this type of approach is laborious, methods have been developed for multiple alanine substitutions in a high-throughput manner [248]. One such approach builds on codon-based mutagenesis, analyzing multiple positions by applying split-and-mix synthesis to produce degenerate oligonucleotides (one pool for the alanine codon and another for the wild-type codon) [249]. An alternative to alanine-scanning is serine-scanning, which follows the logic that, sometimes, substitutions with the hydrophobic alanine side chain may be more detrimental to the peptide's affinity compared to the slightly larger but hydrophilic serine side chain. Similarly, homolog-scanning (substitutions at individual positions with similar residues) may be employed with the goal of minimizing structural disruption and identifying residues essential for maintaining a function [250].
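The design step of a single-site alanine scan can be sketched in a few lines. The following is a minimal illustration at the peptide-sequence level, assuming one-letter amino acid codes and skipping positions that are already alanine. The function name and the example sequence are hypothetical and are not taken from the cited studies.

```python
def alanine_scan(peptide: str) -> list:
    """Return (mutation_label, variant_sequence) pairs for a single-site alanine scan.

    Labels use the common wild-type/position/mutant notation, e.g. "K3A".
    Positions that are already alanine are skipped, since no substitution is possible.
    """
    variants = []
    for index, residue in enumerate(peptide):
        if residue == "A":
            continue
        label = f"{residue}{index + 1}A"
        variants.append((label, peptide[:index] + "A" + peptide[index + 1:]))
    return variants


if __name__ == "__main__":
    # Hypothetical hexapeptide used purely for demonstration
    for label, sequence in alanine_scan("GSWKAF"):
        print(label, sequence)
```

Replacing the substituted letter with "S" would give the corresponding serine scan; each variant would then be synthesized or displayed and assayed individually.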

Another site-directed mutagenesis type is cassette mutagenesis. It works by replacing a section of genetic information with an alternative, synthetic sequence, a "cassette" [251]. Different from other approaches that target short regions of a gene, this method is convenient for sequences up to 100 bp in size [252,253]. A prerequisite for this method to be practical is that the gene cassette must be flanked by two restriction sites that are complementary and unique with digest sites on the targeted vector. Restriction enzymes excise the targeted fragment from a vector that can then be replaced with DNA sequences carrying desired mutations. If a larger fragment is to be cloned, the "megaprimer" approach is applied by amplification with a series of oligonucleotides [254]. This method can also benefit from using "spiked" synthetic oligonucleotides, allowing randomization at multiple sites [255,256]. Cassette mutagenesis is based on Kunkel mutagenesis, which is time-consuming, so researchers developed an improved version termed "PFunkel", a conflation of *Pfu* DNA polymerase and Kunkel mutagenesis, that can be performed in a day's work [257,258]. To overcome the main constraint of site-directed mutagenesis, which is the tedious primer design, rational design techniques can be utilized to introduce desired mutations at precise positions. Researchers can leverage readily available tools such as AAscan, PCRdesign, and MutantChecker to simplify and boost the mutagenesis process [259].

#### 4.2.2. Chemical-Based Mutagenesis

Chemical-based mutagenesis relies on various chemical methods to produce the desired mutants. To chemically synthesize fully randomized oligonucleotides, a mixture of nucleotides must be applied at each coupling step [260]. A serious problem with this strategy is the pronounced bias resulting from the uneven incorporation frequency of the 4 nucleotide building blocks due to their inherent reactivity differences, rendering statistically random mutations inaccessible. Avoiding incorporation of stop codons is practically unattainable and the system is inclined towards amino acid residues encoded by redundant codons [261]. This problem can be tackled by adjusting the mutational frequency with "spiked oligonucleotides" [255], taking into account the differences in reactivity of mononucleotides and the redundant genetic code. The essence of DNA spiking is that non-equimolar ratios of bases are applied at targeted positions during oligonucleotide synthesis, meaning each wild-type nucleotide can be custom "doped", achieving either "soft" (high incidence of a certain nucleotide) or "hard" (equal incidence of all four nucleotides) randomization, and thereby manually tuning the occurrence of certain amino acids at defined positions in the (poly)peptide chain.
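The effect of doping on the encoded amino acids can be estimated with a simple probability model. The sketch below is a minimal illustration, assuming the standard genetic code, independent positions, and ideal coupling (the non-wild-type fraction is split equally among the other three bases, which ignores the reactivity differences discussed above). The function names, the example codon, and the 70% wild-type fraction are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop)
_ORDER = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_ORDER, repeat=3), _AMINO_ACIDS)}


def doped_position(wt_base: str, wt_fraction: float) -> dict:
    """Base probabilities at one position: wt_fraction for the wild-type base,
    the remainder shared equally by the other three (idealized assumption)."""
    other = (1.0 - wt_fraction) / 3.0
    return {base: (wt_fraction if base == wt_base else other) for base in "ACGT"}


def doped_codon_distribution(wt_codon: str, wt_fraction: float) -> dict:
    """Probability of each amino acid (or stop, '*') encoded by a doped codon."""
    per_position = [doped_position(base, wt_fraction) for base in wt_codon]
    distribution = defaultdict(float)
    for bases in product("ACGT", repeat=3):
        probability = 1.0
        for position, base in zip(per_position, bases):
            probability *= position[base]
        distribution[CODON_TABLE["".join(bases)]] += probability
    return dict(distribution)


if __name__ == "__main__":
    soft = doped_codon_distribution("GCT", 0.70)   # "soft": wild type favored
    hard = doped_codon_distribution("GCT", 0.25)   # "hard": all bases equal
    print(f"soft: P(Ala) = {soft['A']:.3f}, P(stop) = {soft.get('*', 0.0):.4f}")
    print(f"hard: P(Ala) = {hard['A']:.3f}, P(stop) = {hard.get('*', 0.0):.4f}")
```

In this idealized model, a wild-type fraction of 0.25 reproduces fully random (NNN-like) codons, whereas higher fractions keep most clones close to the parental sequence.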

Site-saturation mutagenesis seeks to achieve mutation at a maximal capacity by examining substitutions of a given residue against all possible amino acids. A fully randomized codon NNN (where N = A/C/G/T) gives rise to all possible 64 variant combinations (also known as 64-fold degeneracy) and codes for all 20 amino acids and 3 stop codons. This causes difficulties during library screening and risks enrichment of non-functional clones due to the random introduction of termination codons [262]. Operating with NNK, NNS, and NNB codons (where K = G/T, S = C/G, and B = C/G/T) minimizes the degeneracy in the third position of each codon, consequently lowering codon redundancy and the frequency of terminations [263]. However, such degenerate primers are expensive to synthesize, and using a single degenerate primer to completely eliminate codon redundancy while providing all 20 amino acids is unattainable, due to disproportional representation of certain amino acids [264,265]. Other strategies have to be employed to circumvent these constraints.
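The trade-off between these degenerate codons can be checked by enumeration. The sketch below is a minimal illustration, assuming the standard genetic code and the IUPAC ambiguity symbols defined in the text. The helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop)
_ORDER = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_ORDER, repeat=3), _AMINO_ACIDS)}

# IUPAC ambiguity symbols used in the text
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "K": "GT", "S": "CG", "B": "CGT"}


def expand(degenerate_codon: str) -> list:
    """List every concrete codon encoded by a degenerate codon such as 'NNK'."""
    return ["".join(bases) for bases in product(*(IUPAC[s] for s in degenerate_codon))]


def summarize(degenerate_codon: str) -> dict:
    """Count codons, distinct amino acids, and stop codons for a degenerate codon."""
    translated = [CODON_TABLE[codon] for codon in expand(degenerate_codon)]
    return {
        "codons": len(translated),
        "amino_acids": len(set(translated) - {"*"}),
        "stops": translated.count("*"),
    }


if __name__ == "__main__":
    for scheme in ("NNN", "NNK", "NNS", "NNB"):
        print(scheme, summarize(scheme))
```

All four schemes still encode the full set of 20 amino acids; the restricted third position reduces the number of stop codons from three to one but does not remove the unequal number of codons per amino acid.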

To synthesize redundancy-free mutagenic primers, mono [266], di [267], or trinucleotide phosphoramidite [268] solutions (or combinations [269]) can be used. This way, mixtures of oligonucleotides encoding all possible amino acid substitutions within a defined stretch of peptide or a limited number of amino acids (i.e., "tailored" randomization) can be synthesized. This fine-tuning gives complete control over amino acid prevalence at defined positions in the corresponding (poly)peptide sequence, achieving "soft" or "hard" randomization. With this approach, codon redundancy and stop codons are completely eliminated [261]. Another randomization strategy labeled MAX eliminates genetic redundancy by using a collection of 20 primers containing only codons for each amino acid with the highest expression frequency in *E. coli* [270]. These primers are annealed to a template strand with completely randomized codons (NNN or NNK) at the targeted position. Any misannealing is trivial, since only the ligated selection strand is amplified by a subsequent PCR. The produced random cassettes are then enzyme-digested for cloning. Further development of this strategy gave birth to an upgraded version dubbed ProxiMAX in which multiple contiguous codons are randomized in a non-degenerate manner [271]. Here, a donor blunt-end dsDNA with terminal MAX codons and an upstream *Mly*I restriction site is ligated to an acceptor blunt-end dsDNA. The product strands are amplified, analyzed, and combined at desired ratios in the next randomization cycle. After each ligation cycle, endonuclease *Mly*I is applied to remove the donor DNA strand, making only the randomized sequences available for the successive ligation cycle.

Another strategy, developed by Tang et al. [264], is cost-effective and uses degenerate codons to eliminate redundancy or bring it close to zero. A mixture of four codons, NDT, VMA, ATG, and TGG (where D = A/G/T, V = A/C/G, M = A/C) with a molar ratio of 12:6:1:1 at each randomized position results in an equal theoretical distribution for each of the 20 amino acids, without occurrences of stop codons. Following a similar rationale, Kille et al. [265] developed the "22c-trick", which uses only three codons per randomized position: NDT, VHG, and TGG (where H = A/C/T), at a 12:9:1 molar ratio. The name derives from the usage of 22 unique codons, achieving a near-uniform amino acid distribution (i.e., 2/22 for Leu and Val, and 1/22 for each of the remaining 18 amino acids). Other sophisticated primer mixing strategies have been reported [272–274], although picking the best approach depends mostly on the size and quality of the library to be prepared, and on the lab's operating budget [275].
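The theoretical amino acid distributions of both primer mixtures can be reproduced by enumeration. The sketch below is a minimal illustration, assuming the standard genetic code, ideal mixing at the stated molar ratios, and equal representation of the individual codons within each degenerate primer. The function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop)
_ORDER = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_ORDER, repeat=3), _AMINO_ACIDS)}

# IUPAC ambiguity symbols used in the text
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "D": "AGT", "V": "ACG", "M": "AC", "H": "ACT"}


def expand(degenerate_codon: str) -> list:
    """List every concrete codon encoded by a degenerate codon such as 'NDT'."""
    return ["".join(bases) for bases in product(*(IUPAC[s] for s in degenerate_codon))]


def mixture_distribution(mixture: dict) -> dict:
    """Amino acid frequencies for a primer mixture given as {degenerate codon: molar parts}."""
    total_parts = sum(mixture.values())
    distribution = defaultdict(Fraction)
    for degenerate_codon, parts in mixture.items():
        codons = expand(degenerate_codon)
        for codon in codons:
            # Each primer's share is split evenly over the codons it encodes
            distribution[CODON_TABLE[codon]] += Fraction(parts, total_parts * len(codons))
    return dict(distribution)


if __name__ == "__main__":
    tang = mixture_distribution({"NDT": 12, "VMA": 6, "ATG": 1, "TGG": 1})
    trick_22c = mixture_distribution({"NDT": 12, "VHG": 9, "TGG": 1})
    print("Tang et al.:", {aa: str(f) for aa, f in sorted(tang.items())})
    print("22c-trick:  ", {aa: str(f) for aa, f in sorted(trick_22c.items())})
```

Exact fractions are used instead of floating-point numbers so that the uniform 1/20 distribution and the 2/22 frequencies for Leu and Val can be compared without rounding error.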

#### **5. Conclusions**

"Design, build, test, repeat" is the core philosophy of synthetic biology. Considering the intricacies of biological systems, every step of this process is potentially affected by multiple obstacles. Researchers are often put in predicaments where their progress is halted by unpredictable issues, and solving them can last even months at a time. These are not sporadic isolated events, but rather frequent and occur in virtually every field in natural sciences, let alone molecular biology. Building a peptide library is no different; it involves miscellaneous techniques and is often laborious. Recent advances are trumping bottlenecks in practically every facet of this technology. Today, new technologies enable accurately modeling peptide structure as long as 40 residues [276], and in silico design and screening using bioinformatics tools like Rosetta [277]. Furthermore, the ability to synthesize oligonucleotides adequate in size to code for potential peptide binders and assemble them in a display format of choice is now available even to labs with tight resources [275]. Alternatively, commercial peptide libraries can be purchased, which is very convenient especially for small operations with limited personnel, equipment, and know-how. Owing to automation, unprecedented parallel screening ability is now a reality, and coupled with high-throughput deep DNA sequencing [31], discovery of large numbers of novel high-affinity binders is a realistic prospect.

So, what predictions can we make for the future of peptide discovery? Surely, the convergence of new molecular strategies coupled with novel high-throughput methods and machine learning [278] will aid future bench researchers with engineering novel peptide binders. Initiating such projects could one day be as simple as running a computer program, and ordering (or synthesizing) a library—an effortless endeavor.

**Author Contributions:** Conceptualization: K.B. and T.B.; Writing – Original Draft: K.B.; Writing – Review and Editing: K.B. and T.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was funded by the Slovenian Research Agency, program P4-0127.

**Acknowledgments:** The authors are grateful to Stella Ivšek for contributing the artwork for this paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**



#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Accumulation of Innate Amyloid Beta Peptide in Glioblastoma Tumors**

#### **Lilia Y. Kucheryavykh <sup>1</sup>, Jescelica Ortiz-Rivera <sup>1</sup>, Yuriy V. Kucheryavykh <sup>1</sup>, Astrid Zayas-Santiago <sup>2</sup>, Amanda Diaz-Garcia <sup>2</sup> and Mikhail Y. Inyushin <sup>2,</sup>\***


Received: 8 April 2019; Accepted: 15 May 2019; Published: 20 May 2019

**Abstract:** Immunostaining with specific antibodies has shown that innate amyloid beta (Aβ) is accumulated naturally in glioma tumors and nearby blood vessels in a mouse model of glioma. In immunofluorescence images, Aβ peptide coincides with glioma cells, and enzyme-linked immunosorbent assays (ELISAs) have shown that Aβ peptide is enriched in the membrane protein fraction of tumor cells. ELISAs have also confirmed that the Aβ(1–40) peptide is enriched in glioma tumor areas relative to healthy brain areas. Thioflavin staining revealed that at least some amyloid is present in glioma tumors in aggregated forms. We suggest that the presence of aggregated amyloid in glioma tumors, together with the presence of Aβ immunofluorescence coinciding with glioma cells and the nearby vasculature, implies that the source of Aβ peptides in glioma may be systemic Aβ from blood vessels, but this question remains unresolved and needs additional studies.

**Keywords:** amyloid; Aβ peptide; glioma; platelets

#### **1. Introduction**

Just as Alzheimer's disease (AD) affects mostly the elderly population [1], glioblastoma (GBM) is the most common primary malignant brain tumor in older people [2]. Recently, statistically independent cohort studies have found an inverse association between cancers in general and AD [3–5]. Specifically, most patients with AD are protected from lung cancers [3], and, vice versa, cancer survivors have a lower risk of AD [6]. However, there is a significant positive correlation between the AD mortality rate and the malignant brain tumor mortality rate [4,7,8]. These correlations suggest that there are common factors in these diseases. Mitochondrial metabolism, in general, and the p53, Pin1, and Wnt cellular signaling pathways, in particular, were proposed as possible linkages in this cancer–AD relationship [9,10]. Interestingly, chemotherapy [6] and radiotherapy [9] also affected this correlation.

On the other hand, the buildup of amyloid precursor protein (APP), the precursor of the AD hallmark amyloid beta (Aβ) peptides, has now been found in pancreatic and breast cancer tumors and the corresponding metastatic lymph nodes [11,12]. Proteolytic cleavage of APP by the α-secretase pathway mediates proliferation and migration in breast cancer, while other pathways were not studied [13]. It was also discovered that plasma levels of Aβ peptides in esophageal cancer, colorectal cancer, hepatic cancer, and lung cancer patients were significantly higher than in normal controls [14]. The question arises, what is the source of these Aβ peptides? Moreover, what is their role?

Aβ peptides can be generated by glioma cells themselves. It was shown that glioma cells in culture produce the 4-kDa Aβ peptide, which co-migrates with synthetic Aβ(1–40) (also known as Aβ40) and is specifically recognized by antibodies raised against terminal domains of the Aβ peptide, and release it into the medium [15]. The role of Aβ peptides in glioma development was investigated in another study [16]. It was reported that full-length Aβ40 is a dose-dependent inhibitor of angiogenesis and suppresses human U87 glioblastoma subcutaneous xenografts in nude mice. A small peptide sequence of Aβ, Aβ(11–20), was found to be a potent anti-angiogenic molecule. Systemic delivery of this peptide leads to reductions in glioma proliferation, angiogenesis, and invasiveness [16]. Furthermore, parallel experiments in transgenic mice overexpressing Aβ40 also showed reductions in glioma growth, invasion, and angiogenesis [16–18].

However, besides glioma itself, there is another, systemic source of Aβ peptide production in the body [19,20]. Recently, we showed that platelets produce a massive release of Aβ after thrombosis in the brain and skin and that this release is concentrated near blood vessels [21,22]. It has been shown that platelets are hyperactivated in cancer patients and form cancer cell-induced aggregates and micro-thrombi in the vasculature near tumors (reviewed in [23]). A high platelet count is associated with poor survival in a large variety of cancers, while thrombocytopenia or antiplatelet drugs can reduce the short-term risk of cancer, cancer mortality, and metastasis (reviewed in [24]). Platelets affect glioma cells by releasing platelet-derived growth factor (PDGF) [25]. Might platelet-generated Aβ also diffuse to glioma cells and accumulate inside these brain tumors?

In our study, we chose specific antibodies against Aβ peptides with low reactivity for the precursor APP to see whether Aβ immunoreactivity is present in glioma tumors and nearby blood vessels in mice. We used an enzyme-linked immunosorbent assay (ELISA) to study Aβ40 content in tumor and "healthy" brain areas, while also assessing Aβ40 content in the membrane and cytoplasmic fractions of glioma cells. The presence of aggregated forms of amyloid inside glioma tumors was evaluated as well.

#### **2. Results**

#### *2.1. Immunoreactivity against A*β *Peptides Is Present in Glioma Cells in Primary and Secondary Tumors as Well as in Blood Vessels and Erythrocytes in the Near Vicinity, Indicating that the A*β *Level Is Elevated in the Tumor Zone*

After glioma implantation into mouse brains using standard methods established in our laboratory [26,27], we allowed 16 days of tumor growth. We then prepared brain slices containing tumors within nearby tissue. Immunostaining with polyclonal (Figure 1A,B, green) antibody against Aβ showed that these peptides are present in glioma cells (white arrows), in nearby broken blood vessels, and in escaped erythrocytes. In addition, astrocytes are marked by red fluorescence (anti-Glial Fibrillary Acidic Protein (anti-GFAP)), and the nuclei are marked blue (4′,6-diamidino-2-phenylindole (DAPI) staining). The same images (Figure 1A,B) are presented as moving confocal images (Figure S1A,B) so that blood vessel details and their relation to glioma cells are more discernible. Inside blood vessel segments marked by Aβ green immunofluorescence, erythrocytes were also specifically marked by Aβ (Figure 1A,B, see also Figure S1A,B), as were erythrocytes diffused locally near broken blood vessels (Figure S1A,B), as blood vessels near the tumor are usually ruptured [28]. As was shown previously, Aβ peptide in blood plasma binds to practically all erythrocytes and may be a marker for AD [29]. Also, the addition of synthetic Aβ specifically marks erythrocyte membranes [30]. We want to stress once again that Aβ immunofluorescence is present only in blood vessel segments near the glioma tumor and in the tumor itself (Figure 1A,B and Figure S1A,B). Therefore, only the glioma cells in the tumor, and nearby blood vessels that contain erythrocytes and lie within 0–200 μm of the ruptured blood vessel, are fluorescent.

**Figure 1.** Aβ peptide immunoreactivity (green) in glioma cells and in nearby blood vessels. (**A**) A small glioma tumor near a broken blood vessel. Aβ peptide immunoreactivity (green) visible in glioma cells (white arrow) and in blood vessels. Erythrocytes released from the broken vessel are also marked with Aβ-related immunofluorescence (yellow arrow). (**B**) A larger glioma tumor in which a broken blood vessel passes through the tumor (more clearly visible in the 3D image of this tumor shown in Figure S1B), and white arrows indicate glioma cells marked by green immunofluorescence representing Aβ peptide. For **A** and **B**, astrocytes are indicated by immunoreactivity to GFAP (red) and cell nuclei by DAPI staining (blue). Scale bar, 20 μm. (See also supplemental confocal 3D images of the same tumors in Figure S1A,B, respectively).

We also made ELISA measurements of mouse Aβ40 peptide in the brain sample tissue containing the main tumor versus the "healthy" control from the corresponding cortical zone in the other hemisphere from the same animal 16 days after glioma implantation. Similar amounts of the homogenate were taken for analysis. It was found that the relative amount of Aβ40 in the glioma tissue was 142 ± 9% larger and statistically different (*p* < 0.001; *t* = 4.714; d*f* = 4; *n* = 3) from "healthy" tissue (Figure 2A).

In these experiments, we found that glioma cells exhibit specific Aβ immunofluorescence that clearly marks these cells, but the question arises whether it is inside the cells or somehow attached to the external membrane.

#### *2.2. A*β*40 Is Concentrated in the Membrane Cell Fraction in Glioma Tumor Tissue*

To determine more precisely where Aβ is distributed, we separated the cytoplasmic and membrane fractions of proteins from glioma cells from the main tumor extracted from the brain of animals 16 days after implantation. Before processing, blood cells were eliminated from the tumor tissue samples using the Percoll purification method. Membrane and cytoplasmic proteins were isolated, and the total protein content was determined using the Bradford spectrophotometric method to establish a reference point for measuring the amount of Aβ in each fraction. Using ELISA, it was found that the relative amount of Aβ40 in the membrane fraction is significantly greater (170 ± 4%, *p* < 0.001, *t* = 16.23, d*f* = 4, *n* = 3) than in the cytoplasmic fraction (Figure 2B).

**Figure 2.** (**A**) The relative amount of Aβ40 in the glioma tissue is elevated. (**B**) Aβ40 in glioma tumor tissue is concentrated in the cell membrane fraction.

#### *2.3. Glioma Tumor Tissue Contains Aggregated Amyloid*

To determine whether glioma tumors have aggregated forms of Aβ with cross-β architecture, we used standard thioflavin T and thioflavin S staining of brain slices with glioma from animals with implanted glioma cells. It was previously demonstrated that both thioflavin T and thioflavin S fluorescence originates mainly from dye bound to aggregated forms of amyloids with cross-β-pleated sheet structure, and gives a distinct increase (and a spectral shift in the case of thioflavin T) in fluorescence emission after binding [31,32]. We used IP injection of thioflavin T, while slices containing tumors were additionally stained with thioflavin S. Both dyes specifically marked glioma tumors (Figure 3), in which staining (green for thioflavin T and red for thioflavin S) is obvious only inside the tumor body, while the nearby normal tissue remained unstained.

**Figure 3.** Aggregated amyloid visualized by staining with thioflavin T (green) and thioflavin S (red) inside the glioma tumor body. The white arrow shows the glioma tumor body visible in the brain slice.

#### **3. Discussion**

Here we report that antibodies against Aβ with relatively low reactivity against APP [33] show Aβ immunostaining in glioma cells and nearby blood vessels in mice (Figure 1). Using ELISA, we also report that Aβ40 levels are significantly increased in glioma (Figure 2). Glioma tissue from one brain hemisphere contains about two-fold more Aβ than a similar amount of tissue from the "mirror" hemisphere, with Aβ concentrated in the membrane fraction. The question arises whether Aβ comes from a systemic source (the blood) and marks the glioma cell membrane, or is synthesized by the glioma cells themselves.

Previous studies support the possibility of a systemic source for this Aβ. Increased Aβ content in blood plasma has already been reported for different types of cancer [14]. Systemic Aβ is generated in large quantities by blood platelets in broken vessels, as we have shown for the thrombotic process [21,22]. Here, broken blood vessels marked by extensive Aβ fluorescence can be seen near tumors in our experiments (Figure 1A,B and Figure S1A,B). It has been shown previously that platelets are hyperactivated in cancer patients and form cancer cell-induced aggregates and micro-thrombi in vasculature near tumors (reviewed in [23]), thus suggesting that the source of Aβ that we have found for the clotting process may also be present here. It seems possible that Aβ released from clots can migrate and somehow mark only glioma cells (Figure 1A,B), but this raises new questions about why Aβ marks glioma cells so specifically.

To bind specifically, Aβ must be recognized by a specific receptor on the external membrane of the glioma cell. A known specific Aβ receptor, such as the PrPC–mGluR5 complex, is associated with proline-rich tyrosine kinase 2 (Pyk2 or PTK2B) [34,35]. This receptor localizes to postsynaptic sites in the brain, but is also overexpressed in all glioblastoma cells, where it controls cell migration [27,36]. Aβ is a known inhibitor of Pyk2 [35]. Thus, its release by platelets may be a part of the intrinsic immunity that is directed against cancerous gliomas. Another suspected molecule related to Aβ binding is PI3K (phosphatidylinositol 3-kinase). This kinase and its signaling network are also present and hyperactivated in a majority of glioblastoma cells, where they control membrane microdynamics and cell cycling [37,38]. Its Aβ receptor is unknown, but it complexes with PI3K and most probably is situated on the external membrane [39,40]. It is known that Aβ inhibits PI3K activity as well [41]. We speculate that in this case, Aβ peptides generated by platelets also play a role in the intrinsic immunity directed against cancerous gliomas.

In addition, Aβ may bind to the receptor for advanced glycation end products (RAGE). It is known that this receptor is a binding site for Aβ peptides [42], thus mediating Aβ transport through the blood–brain barrier [43]. The very same RAGE receptor regulates the tumor environment and tumor cell migration, is part of the important microglial activation mechanism, and is overexpressed in tumors [44].

On the other hand, it was shown that glioma cells in culture produce Aβ peptides that comigrate with synthetic Aβ40, are specifically recognized by antibodies raised against the terminal domains of the Aβ peptide, and are released by these cells into the medium [15]. However, cultured and in vivo astrocytes also produce Aβ peptides in similar amounts [45–47], and astrocytes were not marked by Aβ immunofluorescence in our experiments, probably because these peptides are present in or near the astrocytes in amounts that can be neglected compared with the glioma tumor cells that we have studied here. While derived from the same cell type, glioma cells are clearly marked by Aβ immunofluorescence in our experiments (Figure 1).

It is clear that the question of whether the source of Aβ is inside the glioma cell itself or is a systemic source from blood vessels should be investigated further. Nevertheless, all our results from these experiments taken together, as well as our previous experience with Aβ peptides released during platelet accumulation and aggregation in thrombotic blood vessels [21,22], lead us to the conclusion that Aβ peptides are most probably generated by platelets and somehow bind almost exclusively to glioma cells.

An additional issue is the accuracy of Aβ40 concentration measurements in brain tissue. In our study of Aβ40 concentrations in tissue, we used relative values, indicating the percentage change from initial values, as the most accurate. It was shown previously that the Invitrogen Aβ40 ELISA Kit is very specific to murine Aβ40, but the data are very sensitive to "noise" (such as the presence of other proteins and lipids), and absolute values can deviate by 40–50% [48]. Also, ELISA data may vary considerably with different collection and storage protocols [49]. Measurement of Aβ by ELISA reveals mainly free peptides, while a significant amount of Aβ peptide remains bound to proteins, lipoproteins, and cell membranes [50].

Our experiments also indicate that there is some thioflavin-positive amyloid inside glioma tumors (Figure 3). While we have shown that Aβ peptides are definitely present in the tumor and may constitute a predominant part of this glioma amyloid, the specific type of aggregated amyloid found inside the borders of glioma tumors is unknown. In our opinion, this amyloid is most probably mixed amyloid, as was found for AD [51]. Protein aggregation is sequence specific, favoring self-assembly over cross-seeding with nonhomologous sequences [52]. However, proteins with aggregation-prone regions may aggregate with each other at elevated concentrations, forming a mixed misfolded amyloid [53]. In this case, one aggregated protein can work as a "seed" for aggregation of other protein types. Previously, different amyloids were found in a variety of tumors. Different carcinomas have amyloid stroma [54,55], and odontogenic tumors are positive for thioflavin T and Congo Red staining and are also immunopositive for the enamel matrix protein ameloblastin [56–58]. Similarly, amyloid was reported in breast cancer tumors but was determined to be a localized amyloid light chain (AL) type (primary amyloidosis caused by immunoglobulin light-chain β-sheeting) [59,60]. Localized AL type amyloidosis was also found in myeloma (plasma cell) tumors as well as in kidneys and early-stage non-small-cell lung adenocarcinomas [61]. If the amyloid in glioma tumors is indeed mixed, its composition must be studied further, because tumor-related amyloid could be a new target for anticancer therapy.

#### **4. Materials and Methods**

#### *4.1. Ethics Statement*

All procedures involving rodents were conducted in accordance with the National Institutes of Health regulations concerning the use and care of experimental animals and approved by the Universidad Central del Caribe Institutional Animal Care and Use Committee. All efforts were made to minimize suffering. In all surgical experiments, animals were anesthetized with isoflurane (4% for induction and 1.75% for maintenance) using a Matrix Quanti-flex VMC Anesthesia Machine for small animals (Midmark Corporation, Dayton, OH, USA). The animals were sacrificed for brain tissue and blood analysis after experiments.

#### *4.2. Glioma Cell Culture*

The GL261 glioma cell line derived from C57BL/6 mice was obtained from the NCI (Frederick, MD, USA). All cells were cultured in Dulbecco's Modified Eagle's Medium (DMEM) supplemented with 10% fetal calf serum, 0.2 mM glutamine, and antibiotics (50 U/mL penicillin, 50 μg/mL streptomycin) and maintained in a humidified atmosphere of CO<sub>2</sub>/air (5%/95%) at 37 °C. The medium was exchanged with fresh culture medium every 2–3 days.

#### *4.3. Intracranial Implantation of Glioma Cells*

All surgery was performed under isoflurane anesthesia, and all efforts were made to minimize suffering. GL261 glioma cells were implanted into the right cerebral hemisphere of 12–16-week-old C57BL/6 mice. Implantation was performed according to the protocol that we described earlier [26]. Briefly, mice were anesthetized with isoflurane, and a midline incision was made on the scalp. At stereotaxic coordinates of bregma, 2 mm lateral, 1 mm caudal, and 3 mm ventral, a small burr hole (0.5 mm diameter) was drilled into the skull. One microliter of cell suspension (2 × 10<sup>4</sup> cells/μL in phosphate buffer solution (PBS)) was delivered at a depth of 3 mm over 2 min. Sixteen days following injection, the animals were anesthetized with pentobarbital (50 mg/kg) and transcardially perfused with PBS followed by 4% paraformaldehyde (PFA). The brains were removed and post-fixed in 4% PFA/PBS for 24 h at 4 °C, followed by 0.15 M, 0.5 M, and 0.8 M sucrose at 4 °C until fully dehydrated. The brains were then frozen and embedded in Cryo-M-Bed embedding compound (Bright Instrument, Huntingdon, UK) and cut using a Vibratome UltraPro 5000 cryostat (American Instrument, Haverhill, MA, USA).
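The injection parameters above imply a total cell number and a mean delivery rate. The following is a minimal arithmetic sketch using only the values stated in the protocol; the function and constant names are illustrative and not part of the original method.

```python
# Values taken from the implantation protocol described above.
CELLS_PER_UL = 2e4    # cell density of the suspension (cells/uL)
VOLUME_UL = 1.0       # injected volume (uL)
DURATION_MIN = 2.0    # delivery time (min)


def implanted_cells(density_per_ul: float, volume_ul: float) -> float:
    """Total number of cells delivered in one injection."""
    return density_per_ul * volume_ul


def infusion_rate(volume_ul: float, duration_min: float) -> float:
    """Mean delivery rate in uL/min."""
    return volume_ul / duration_min


total_cells = implanted_cells(CELLS_PER_UL, VOLUME_UL)
rate_ul_per_min = infusion_rate(VOLUME_UL, DURATION_MIN)
print(f"{total_cells:.0f} cells delivered at {rate_ul_per_min} uL/min")
```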

#### *4.4. Percoll Purification of Blood Cells from Tissue Samples for Membrane Fraction Isolation*

To study Aβ distribution inside tumor cells, we first eliminated blood cells from the tumor tissue sample using the Percoll purification method. Tumors and healthy cortex from the contralateral hemisphere were removed from the mouse brains, minced into 1–2-mm pieces with a razor blade, and enzymatically homogenized using a collagenase/hyaluronidase in DMEM (cat. #07912, Stemcell Technologies, WA, USA). Blood cells were separated from the homogenized tissue using Percoll (Sigma-Aldrich, St. Louis, MO, USA) gradients of 30% and 70%. Following this procedure the tissue fraction free from blood cells was collected from the top of the 70% Percoll level and used for further analysis.

#### *4.5. Isolation of Membrane and Cytoplasmic Proteins*

A homogenized cell suspension was resuspended and sonicated in 20 mM Tris buffer containing 1 mM ethylenedinitrilotetraacetic acid (EDTA), 1 mM β-mercaptoethanol, and 5% glycerin, pH 8.5 with HCl, 1 μM Na3VO4, 0.5 mM phenylmethylsulfonyl fluoride (PMSF), and 10 mM dithiothreitol (DTT). After centrifugation the supernatant was collected and used for further investigations as the cytoplasmic protein fraction. The pellet containing the membranes and the membrane proteins was lysed, and clarified cell lysate was used as the membrane protein fraction.

#### *4.6. Enzyme-Linked Immunosorbent Assay (ELISA) Measurements*

A specialized, ready-to-use, mouse-specific, solid-phase sandwich ELISA kit (cat. #KMB3481; Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA) was used for direct measurement of the amount of Aβ40 peptide in the brains of experimental animals in accordance with the manufacturer's documentation. Briefly, the brain samples were homogenized mechanically, and 100 mg of homogenate was then lysed in guanidine solution (5 M guanidine HCl, 50 mM Tris HCl, pH 8.0). In other experiments, lysates (normalized to total protein content) from the membrane and cytoplasmic fractions (see above) were used. A monoclonal antibody against the NH2-terminus of mouse Aβ40 peptide was coated onto the wells of the microtiter strips provided in the kit. Samples, including standards of known Aβ40 content for calibration purposes as well as experimental specimens, were pipetted into the wells. After washing, the rabbit antibody specific to the COOH-terminus of Aβ40 was added and detected with horseradish peroxidase-labeled anti-rabbit antibody. The optical density values at 450 nm were determined using a Wallac 1420 Victor 2 Microplate Reader (PerkinElmer Inc., Waltham, MA, USA). The calculated mean reading from the healthy hemisphere (normalized cytoplasmic fraction) was defined as 100%, while other readings were presented as the percentage of this value.
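The relative-value convention described above (mean reference reading set to 100%) can be sketched as follows. This is an illustration only: the readings are hypothetical placeholders, not data from the study, and the function names are assumptions introduced here.

```python
from statistics import mean


def per_protein(signal: float, total_protein: float) -> float:
    """Normalize an ELISA signal to total protein content (e.g. from a Bradford assay)."""
    if total_protein <= 0:
        raise ValueError("total protein must be positive")
    return signal / total_protein


def percent_of_reference(readings, reference_readings):
    """Express readings as a percentage of the mean reference reading (defined as 100%)."""
    ref = mean(reference_readings)
    if ref == 0:
        raise ValueError("mean reference reading must be non-zero")
    return [100.0 * r / ref for r in readings]


# Hypothetical readings in arbitrary units (not measured values).
reference = [1.0, 2.0, 3.0]   # e.g. healthy hemisphere or cytoplasmic fraction
samples = [3.0, 5.0]          # e.g. tumor tissue or membrane fraction

relative = percent_of_reference(samples, reference)
print(relative)
```

Reporting percentages of an in-experiment reference avoids relying on absolute concentrations, which the Discussion notes can deviate considerably between runs.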

#### *4.7. Immunohistochemistry and Confocal Microscopy*

Immunostaining was performed using a protocol previously established in our laboratory [22,62]. Frozen 30-μm sections were generated from brain cortex containing the tumor(s). The sections were blocked with 5% normal goat serum/5% normal horse serum (Vector Laboratories, Burlingame, CA, USA) in 0.10 M phosphate buffer solution (PBS: NaCl, 137 mM; KCl, 2.70 mM; Na2HPO4, 10.14 mM; KH2PO4, 1.77 mM) containing 0.3% Triton X-100 and 0.05% phenylhydrazine for 60 min for permeabilization and then processed separately using two different antibodies against Aβ. For that purpose, slices were incubated with a rabbit polyclonal antibody to Aβ (Abcam, Cambridge, MA, USA, cat. #ab2539) diluted 1:400 in 0.03% Triton X-100, 1% dimethyl sulfoxide (DMSO), 2% bovine serum albumin (BSA), 5% normal horse serum, and 5% normal goat serum in 0.1 M PBS. Anti-GFAP–Cy3 (1:200) was added, and the slices were left overnight at 4 °C. After three washes with permeabilization solution for 10 min, the secondary antibodies (fluorescein-labeled goat anti-rabbit IgG) were added at a dilution of 1:200 with shaking for 2 h at room temperature and protected from light. The slices were then washed three times with PBS for 10 min and once with distilled water before being transferred onto a glass slide containing Fluoroshield mounting medium (Sigma-Aldrich, St. Louis, MO, USA, cat. #F6057) with DAPI. Negative controls were routinely performed by omitting the primary antibody in each staining experiment to validate the immunohistochemical staining quality and results.

For thioflavin (Th) staining we used: (1) ThT staining, in which mice were injected IP with 10 μL/g of 3 mM solution of ThT in PBS. After 5 min, the animals were euthanized, and the brains were harvested and kept in fixative without light. (2) ThS staining, in which brain slices (30 μm) containing tumors were allowed to completely air dry prior to staining, then stained with a drop of 3 mM ThS in PBS (previously filtered through a 0.2-μm filter) for 5 min, then washed twice with distilled water and dried again. The coverslip was mounted with a drop of Vaseline on the slice. DAPI and Cy3 excitation/emission filters were used to visualize ThT and ThS fluorescence, respectively.
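The ThT dose above scales with body weight. The following small sketch works through that arithmetic; the 25 g body weight is a hypothetical example value, and the function name is illustrative.

```python
DOSE_UL_PER_G = 10.0   # injected volume per gram of body weight (uL/g), from the protocol
THT_CONC_MM = 3.0      # ThT concentration (mM, i.e. umol/mL), from the protocol


def tht_injection(body_weight_g: float):
    """Return (injection volume in uL, ThT amount in umol) for a given body weight."""
    volume_ul = DOSE_UL_PER_G * body_weight_g
    amount_umol = THT_CONC_MM * volume_ul / 1000.0   # mM x mL = umol
    return volume_ul, amount_umol


# 25 g is a hypothetical body weight chosen only for illustration.
volume_ul, amount_umol = tht_injection(25.0)
print(f"{volume_ul:.0f} uL containing {amount_umol:.2f} umol ThT")
```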

Images were acquired using an Olympus Fluoview FV1000 scanning inverted confocal microscope system equipped with a 20×, 40×, or 60×/1.43 oil objective (Olympus, Melville, NY, USA). The images were analyzed using ImageJ software (http://imagej.nih.gov/ij) with the Open Microscopy Environment Bio-Formats library and plugin, allowing for the opening of Olympus files (http://www.openmicroscopy.org/site/support/bio-formats5.4/). The data were evaluated using custom colorization.

#### *4.8. Statistics and Measurements*

Using GraphPad Prism 7.03 (GraphPad Software, Inc., La Jolla, CA, USA) for calculations, an unpaired *t*-test was employed to estimate statistical differences. Values were determined to be significantly different if the two-tailed *p*-value was <0.05.
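For readers who want to see how the reported degrees of freedom arise, the sketch below implements an unpaired, equal-variance (Student) *t*-test in plain Python. It assumes the pooled-variance form, which is consistent with d*f* = 4 for two groups of *n* = 3; the data values are hypothetical and are not the measurements from this study. The critical value 2.776 is the standard two-tailed threshold for α = 0.05 at 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(a, b):
    """Student's unpaired t-test statistic and degrees of freedom (pooled variance)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    std_err = sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(a) - mean(b)) / std_err, df


# Two-tailed critical t value for alpha = 0.05 and df = 4.
T_CRIT_DF4 = 2.776

# Hypothetical relative readings (% of control), three samples per group.
control = [95.0, 100.0, 105.0]
tumor = [130.0, 142.0, 154.0]

t_stat, df = unpaired_t(tumor, control)
significant = abs(t_stat) > T_CRIT_DF4
print(f"t = {t_stat:.3f}, df = {df}, significant at 0.05: {significant}")
```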

#### **5. Conclusions**


**Supplementary Materials:** Supplementary materials can be found at http://www.mdpi.com/1422-0067/20/10/2482/s1.

**Author Contributions:** Conceptualization, methodology, original draft preparation, review and editing, and formal analysis were performed by M.Y.I. and L.Y.K.; experimental investigation, formal analysis, visualization, data curation, and review and editing were performed by L.Y.K., A.Z.-S., A.D.-G., J.-O.-R., and Y.V.K.

**Funding:** This research was supported by NIH grants SC1GM122691 to L.Y.K. and SC2GM111149 to M.Y.I. The funding sources had no role in study design; data collection, analysis, or interpretation; or the decision to submit this article.

**Acknowledgments:** We thank the personnel of the Animal Resources Center at Universidad Central del Caribe for their kind help.

**Conflicts of Interest:** The authors declare that they have no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

#### **Abbreviations**


#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Hidden Aggregation Hot-Spots on Human Apolipoprotein E: A Structural Study**

**Paraskevi L. Tsiolaki** †**, Aikaterini D. Katsafana** †**, Fotis A. Baltoumas, Nikolaos N. Louros and Vassiliki A. Iconomidou \***

Section of Cell Biology and Biophysics, Department of Biology, National and Kapodistrian University of Athens, Panepistimiopolis, Athens 15701, Greece; etsiolaki@biol.uoa.gr (P.L.T.); k.katsafana@gmail.com (A.D.K.); fbaltoumas@biol.uoa.gr (F.A.B.); nlouros@biol.uoa.gr (N.N.L.)

**\*** Correspondence: veconom@biol.uoa.gr; Tel.: +30-210-7274871; Fax: +30-210-7274254

† These authors contributed equally to this work.

Received: 10 April 2019; Accepted: 6 May 2019; Published: 8 May 2019

**Abstract:** Human apolipoprotein E (apoE) is a major component of lipoprotein particles, and under physiological conditions, is involved in plasma cholesterol transport. Human apolipoprotein E, found in three isoforms (E2, E3, E4), is a member of a family of apolipoproteins that, under pathological conditions, are detected in extracellular amyloid depositions in several amyloidoses. Interestingly, the lipid-free apoE form has been shown to be co-localized with the amyloidogenic Aβ peptide in amyloid plaques in Alzheimer's disease, and the apoE4 isoform in particular is a crucial risk factor for late-onset Alzheimer's disease. Evidence at the experimental level proves that apoE self-assembles into amyloid fibrils in vitro, although the misfolding mechanism has not been clarified yet. Here, we explored the mechanistic insights of apoE misfolding by testing short apoE stretches predicted as amyloidogenic determinants by AMYLPRED, and we computationally investigated the dynamics of apoE and an apoE–Aβ complex. Our in vitro biophysical results prove that apoE peptide–analogues may act as the driving force needed to trigger apoE aggregation and are supported by the computational apoE outcome. Additional computational work concerning the apoE–Aβ complex also designates apoE amyloidogenic regions as important binding sites for oligomeric Aβ, taking an important step forward in the field of Alzheimer's anti-aggregation drug development.

**Keywords:** apolipoprotein E; amyloid fibrils; Alzheimer's disease; Aβ oligomer

#### **1. Introduction**

Human mature apolipoprotein E (apoE) is a 299 amino acid glycoprotein [1,2], taking part in most lipoprotein classes, such as chylomicrons, very low-density lipoproteins (VLDL) and high-density lipoproteins (HDL) [3]. It is a member of an apolipoprotein family, along with apoA-I, apoA-II, apoA-IV, ApoC-I, apoC-II, and apoC-III [4,5]. Each apolipoprotein class has distinct functions and participates actively in the formation of specific lipoprotein scaffolds [6]. Human mature apolipoprotein E is primarily synthesized in the liver, where it is found in higher quantities, but it is also a protein of the brain and other tissues [7]. The functional form of the protein is involved in metabolic pathways that are related to plasma cholesterol and triglyceride transport and distribution among the tissues, by interacting with members of the low-density lipoprotein receptor (LDLR) superfamily [8–11].

The *APOE* gene [12], co-localized with the *APOC1* [12,13] and *APOC2* genes [14–16], has three alleles: *APOE2*, *APOE3* and *APOE4* [17,18]. Each allele exhibits distinct frequencies among the human population, with *APOE3* having the highest (approximately 78%) [19–21]. The expression of these alleles results in three main forms of the protein, namely, apoE2, apoE3, and apoE4. Interestingly, the apoE4 isoform is of great importance, since it is reported to be involved in both hereditary and sporadic types of Alzheimer's disease (AD) [22–24]. The differences among the three forms are restricted to positions 112 and 158 of the mature polypeptide chain. More specifically, in apoE2, cysteines are located in both positions, whereas in apoE4 there is an arginine in both positions. In apoE3, on the other hand, there is a cysteine in position 112 and an arginine in position 158 [25].
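The isoform definitions above reduce to a two-position lookup. A minimal sketch, using one-letter amino acid codes (C for cysteine, R for arginine); the function name and the "unclassified" fallback are illustrative choices rather than established nomenclature.

```python
# Residues at positions 112 and 158 of the mature chain, as described in the text.
ISOFORM_BY_RESIDUES = {
    ("C", "C"): "apoE2",
    ("C", "R"): "apoE3",
    ("R", "R"): "apoE4",
}


def classify_isoform(res112: str, res158: str) -> str:
    """Map the residues at positions 112 and 158 to an apoE isoform name."""
    key = (res112.upper(), res158.upper())
    # Combinations not described in the text are reported as unclassified.
    return ISOFORM_BY_RESIDUES.get(key, "unclassified")


print(classify_isoform("C", "R"))
```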

Apolipoprotein E is found in both lipid-bound and lipid-free forms. Lipid-free species are relatively rare and are possibly the result of transient dissociation events during lipoprotein creation [26–31]. It has not yet been possible for any lipid-free form of apoE to be crystallized in the monomeric form, due to its tendency to assemble into tetramers or octamers [32]. A nuclear magnetic resonance (NMR) structure, with the addition of several mutations, successfully determined the three-dimensional conformation of an apoE lipid-free monomer [33]. According to the model, supported by the experimental outcome of the NMR structure, apoE has three structural domains: the N-terminal domain (Figure 1a, green), the C-terminal domain (Figure 1a, blue), and the hinge domain (Figure 1a, red). The monomer connectivity includes the association of the N-terminal domain (residues 1–167) [34,35] with the C-terminal domain (residues 206–299) [36] through a short interim hinge domain (residues 168–205) [33]. Part of the N-terminal domain adopts a four-helix bundle conformation, which is proposed to be the domain buried in the interior of the lipid-free particle [33] (Figure 1a, green).
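The domain boundaries quoted above can be encoded as a simple lookup, which makes it easy to check which domain a given residue falls in. This is a sketch with illustrative names; the residue ranges are the ones stated in the text.

```python
# Domain boundaries of mature apoE (1-based, inclusive), as given in the text.
DOMAINS = [
    ("N-terminal", 1, 167),
    ("hinge", 168, 205),
    ("C-terminal", 206, 299),
]


def domain_of(residue: int) -> str:
    """Return the structural domain that contains the given residue number."""
    for name, start, end in DOMAINS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside the mature chain (1-299)")


# The two polymorphic positions that distinguish the isoforms.
for position in (112, 158):
    print(position, domain_of(position))
```

Both polymorphic positions, 112 and 158, fall within the N-terminal domain.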

**Figure 1.** Native nuclear magnetic resonance (NMR) structure of human mature apolipoprotein E (apoE) [33] and apoE amyloidogenic profile by AMYLPRED [37]. (**a**) Different colors show all three structural domains of apoE3 in solution: the N-terminal domain (green); the C-terminal domain (blue); and the hinge domain (red). Regions colored in orange illustrate the "aggregation-prone" segments 132ELRVR136 and 158RLAVY162, both located on the 4th helix of the four-helix bundle. (**b**) The amyloid propensity histogram of apoE shows a weak overall amyloidogenicity, since only two segments exceed the consensus AMYLPRED threshold (regions 132ELRVR136 and 158RLAVY162). Color scheme follows the rules described in (**a**).

Lipid-free apolipoproteins related to apoE are implicated in several amyloidoses [38] as a result of their proneness to misfold [39]. ApoE self-accumulation properties are still poorly understood, although, as mentioned above, the *APOE4* allele is known as a causative risk factor for the neurodegenerative AD [40,41]. ApoE has been characterized as a potential Aβ chaperone in AD, suggesting a strong tendency of these two macromolecules to interact. Interestingly, apoE misfolding was proposed as the first step towards Aβ nucleation and polymerization. In any case, the outstanding appearance of apoE in AD and other neurodegenerative diseases is attributed to the fact that lipid transport in cerebrospinal fluid (CSF) is mediated by HDL particles rich in apoE [42–44].

In the context of the "amyloid stretch hypothesis", which proposes that amyloidogenesis is actually driven by short fragments of misfolded proteins [45], scientists have extensively studied a variety of short aggregation-prone stretches with a potential to guide amyloid fibril formation from a soluble globular domain [46–53]. Based on this idea, many algorithms have been developed in an attempt to extract the information of amyloidogenicity only from primary protein sequences [54]. Among them, AMYLPRED, a consensus prediction algorithm developed in our lab [37], was used to identify regions with amyloidogenic properties in the amino acid sequence of apoE (Figure 1b). The ultimate aim of the present study was to characterize the amyloidogenic properties of apoE3, the most common form in the human population. For this purpose, we have used a combination of TEM, X-rays, polarizing microscopy, ATR-FTIR spectroscopy, and molecular dynamics simulations to test whether the predicted apoE fragments can influence aggregation of either apoE or the oligomeric Aβ interacting partner. Our biophysical approach indicates that two aggregation-prone apoE hot-spots (Figure 1a, peptides 132ELRVR136 and 158RLAVY162 shown in orange) have strong self-association properties and destabilize the apoE lipid-free topology. Further, molecular details of the interaction between apoE and oligomeric Aβ, derived by our computational results, also profile the impact of hidden amyloidogenic apoE regions in AD.

#### **2. Results and Discussion**

#### *2.1. Computational Identification of apoE Hot-Spots*

After a computational scanning, AMYLPRED revealed a weak overall amyloidogenic tendency for apoE, in contrast to other amyloidogenic apolipoproteins studied before [55,56]. The consensus prediction recognized two regions of apoE, namely, 133LRV135 and 159LAV161, as peptides with aggregation potency that exceeds the AMYLPRED threshold (Figure 1b). Both peptides were located in the same α-helix corresponding to the N-terminal four-helix bundle domain (Figure 1a, orange). According to AMYLPRED, predicted aggregation hot-spots were only found in the helix bundle of apoE that included the primary binding epitope for both lipids and Aβ [57], although previous in vitro aggregation assays revealed the C-terminal part as the most amyloidogenic apoE domain [58]. Arginine 112, rendering apoE4 the least stable apoE isoform [23,59], does not affect the amyloidogenic profile of different apoEs. Analogous hot-spots were traced in all apoE forms, since the 133LRV135 peptide is a commonly predicted segment for all three apoEs, while 159LAV161 was found only in the apoE3 and apoE4 isoforms (Figure S1). The 133LRV135 peptide is an important functional region, since it neighbors the LDL receptor binding domain of the molecule [34]. It has been suggested that the C-terminal apoE domain dissociates, causing exposure of the four-helix bundle of apoE [33]. This finding is in good agreement with our prediction and verifies the idea that aggregation-prone regions are not buried [37]. We hypothesize that a critical apoE conformational transition can uncover both the 133LRV135 and 159LAV161 aggregation-prone segments, and thus, can initiate apoE misfolding (see MD results below). In this study, predicted regions were extended from both ends, following the idea that five-residue-long peptides are sufficient to independently form amyloid-like fibrils [60], and thus, 132ELRVR136 and 158RLAVY162 pentapeptide–analogues were experimentally used to pinpoint segments that play a crucial role in the self-assembly process of apoE and in the molecular recognition of Aβ.
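The two steps described above, taking a consensus over several predictors and then extending each predicted core by one residue on either side, can be sketched as follows. This is not the AMYLPRED implementation: the per-method hit lists are hypothetical placeholders, the method names are invented, and the agreement threshold `min_methods` is an assumed parameter. Only the resulting residue ranges (133–135 and 159–161, extended to 132–136 and 158–162) come from the text.

```python
from collections import Counter


def consensus_positions(method_hits, min_methods):
    """Residues flagged by at least `min_methods` of the individual predictors."""
    counts = Counter(pos for hits in method_hits.values() for pos in set(hits))
    return sorted(pos for pos, n in counts.items() if n >= min_methods)


def group_regions(positions):
    """Merge a sorted list of residue numbers into contiguous (start, end) regions."""
    regions = []
    for pos in positions:
        if regions and pos == regions[-1][1] + 1:
            regions[-1] = (regions[-1][0], pos)
        else:
            regions.append((pos, pos))
    return regions


def extend_region(region, flank, seq_len):
    """Extend a region by `flank` residues at both ends, clipped to the sequence."""
    start, end = region
    return max(1, start - flank), min(seq_len, end + flank)


# Hypothetical per-method hits (1-based residue numbers); not real predictor output.
method_hits = {
    "method_a": [133, 134, 135, 159, 160, 161],
    "method_b": [132, 133, 134, 135, 160],
    "method_c": [134, 159, 160, 161, 200],
}

core = group_regions(consensus_positions(method_hits, min_methods=2))
extended = [extend_region(r, flank=1, seq_len=299) for r in core]
print(core, extended)
```

With these toy inputs, isolated single-method hits (residues 132 and 200) are discarded, and the two surviving tripeptide cores become the pentapeptide ranges used experimentally.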

#### *2.2. Isolated apoE Peptide–Analogues Fulfill All Basic Amyloid Criteria*

Designed apoE peptide–analogues 132ELRVR136 and 158RLAVY162 were thoroughly examined and found to self-assemble, forming fibril-containing gels after an incubation period of one week. As observed by negative staining TEM, both 132ELRVR136 and 158RLAVY162 fibrillar populations were measured to have similar diameters (Figure 2a,b). The thinnest single fibril of the 132ELRVR136 peptide–analogue had an average diameter of 100 Å, whereas the 158RLAVY162 peptide thickness was approximately 110 Å. However, the overall arrangement of the fibrils in each gel seems to differ between the two peptides, possibly owing to differences between the peptide–peptide interactions, acting as building blocks of the fibrillar core [61]. Congo red was shown to selectively bind on thin hydrated films derived by both peptides, as seen under bright field illumination. The characteristic yellow/green birefringence was clearly seen under crossed polars of a polarizing microscope (Figure 2c,d).

**Figure 2.** Experimental results of self-aggregation assays for apoE peptide–analogues. (**a**,**b**) Electron micrographs of typical amyloid fibrils, derived by self-assembly of (**a**) 132ELRVR136 and (**b**) 158RLAVY162 "aggregation-prone" fragments. Scale bars for (**a**) 132ELRVR136 and (**b**) 158RLAVY162 are 200 nm and 500 nm, respectively. (**c**,**d**) Photomicrographs of apoE peptide fibrils stained with the amyloid-specific Congo red dye ((**c**) 132ELRVR136 and (**d**) 158RLAVY162). The apple-green birefringence, characteristic for all amyloid fibrillar materials, is clearly seen (scale bar, 500 μm). (**e**,**f**) X-ray diffraction patterns from oriented fibers of apoE "aggregation-prone" fragments, (**e**) 132ELRVR136 and (**f**) 158RLAVY162.

X-ray fiber diffraction and FT-IR experiments both showed that, in their fibrillar form, both peptides adopt a well-defined β-sheet conformation. The X-ray patterns indicate that fibrils from both the 132ELRVR136 and 158RLAVY162 peptide–analogues possess the typical "cross-β" architecture of amyloid fibrils (Figure 2e,f). For 132ELRVR136, a strong but diffuse 4.6 Å reflection is seen in the diffraction pattern, in addition to an 11.7 Å structural repeat. These reflections may be attributed to the periodic distance between consecutive hydrogen-bonded β-strands, which are aligned perpendicular to the fiber axis, and to the repetitive distance between packed β-sheets aligned parallel to the fiber axis, respectively. In addition to the typical "cross-β" repetitions, a reflection measured at 23.7 Å could be indicative of the inter-sheet distance (half of 23.7 Å is approximately 11.7 Å), indicating a long-range order of packed β-sheets in the fiber. Finally, the reflection at 15.1 Å may be attributed to the length of the extended 132ELRVR136 peptide. The respective reflections in the diffraction pattern of the 158RLAVY162 peptide were measured at 4.6 Å, representing the repetitive interchain distance between β-strands, and at 11 Å, corresponding to the inter-sheet stacking periodicity; both closely resemble typical "cross-β" patterns from amyloid fibrils. An additional spacing at 20.8 Å is evidence for the distance between ordered and packed β-sheets (half of 20.8 Å is approximately 11 Å). Reflections were also verified using ZipperDB [62] models that overlap with the 132ELRVR136 and 158RLAVY162 peptide–analogues (data not shown). ATR FT-IR was subsequently used to assess the secondary structure characteristics of both peptides and to verify the results derived from X-ray diffraction.
An ATR FT-IR spectrum of a thin film cast from suspensions of the amyloid-like fibrils of the peptide–analogue 132ELRVR136 (Figure 3a) shows prominent bands at 1627 cm<sup>−1</sup> and 1539 cm<sup>−1</sup>, in the amide I and II regions, respectively, indicating the presence of β-sheets. A band at 1695 cm<sup>−1</sup> is indicative of anti-parallel β-sheets (Table 1). Similarly, in the spectrum of 158RLAVY162 (Figure 3b), the bands at 1631 cm<sup>−1</sup> (amide I) and 1548 cm<sup>−1</sup> (amide II) are also attributed to β-sheets, whereas the band at 1689 cm<sup>−1</sup> is attributed to anti-parallel β-sheets (Table 1).

Our experimental analysis reveals that the apoE peptide–analogues 132ELRVR136 and 158RLAVY162 have a strong propensity to independently form β-aggregates, fulfilling the basic amyloid criteria. This finding is compatible with the proposed apoE aggregation pathway, which suggests that a minor apoE fraction forms β-strands that stabilize the apoE fibril core [63].


**Table 1.** Bands observed in the ATR FT-IR spectra obtained from thin films, containing suspensions of fibrils, produced by apoE peptide–analogues, and their tentative assignments.

**Figure 3.** FT-IR spectra (1100–1800 cm<sup>−1</sup>) derived from suspensions of fibrils produced from (**a**) 132ELRVR136 and (**b**) 158RLAVY162. Each apoE peptide was cast on a flat stainless-steel plate and left to air-dry slowly at ambient conditions to form hydrated, thin films. Each film possesses a β-sheet conformation, as is evident from the presence of strong amide I and II bands.

#### *2.3. Implication of apoE Peptide–Analogues in the Tertiary Structural Stability of apoE*

Molecular dynamics simulations were carried out on the most representative NMR conformer of apoE3 [33], putting the spotlight on the implication of the experimentally tested amyloidogenic peptide–analogues 132ELRVR136 and 158RLAVY162. Computational tests assessed the structural stability, integrity, and dynamic behavior of apoE over time (300 ns) under physiological pH conditions at 300 K. Structural movements were monitored over the course of the simulations through time-dependent root mean square deviation (RMSD) measurements with respect to the starting configuration, to evaluate apoE overall structural transitions, as well as through per-residue root mean square fluctuation (RMSF) calculations to monitor the mobility of specific regions.
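
The two stability measures named above can be illustrated with a minimal, dependency-free sketch. It assumes that all frames have already been superimposed on the reference structure (no fitting step is performed here) and uses a hypothetical three-frame, two-atom toy trajectory rather than any data from this study; the function names are illustrative only.

```python
import math

def rmsd(frame, reference):
    """RMSD between two pre-aligned conformations, each a list of (x, y, z)."""
    n_atoms = len(reference)
    squared = sum((a - b) ** 2
                  for p, q in zip(frame, reference)
                  for a, b in zip(p, q))
    return math.sqrt(squared / n_atoms)

def rmsf(trajectory):
    """Per-atom RMSF about each atom's time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in trajectory) / n_frames for k in range(3)]
        msf = sum((f[i][k] - mean[k]) ** 2
                  for f in trajectory for k in range(3)) / n_frames
        values.append(math.sqrt(msf))
    return values

# Hypothetical toy trajectory: atom 0 is static, atom 1 drifts along y.
traj = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
]

rmsd_series = [rmsd(frame, traj[0]) for frame in traj]  # vs. starting frame
rmsf_values = rmsf(traj)                                # one value per atom
```

RMSD yields one number per frame and therefore tracks global drift from the starting configuration, whereas RMSF yields one number per atom (or residue) and highlights which regions are mobile.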

Fibril-forming segments (132ELRVR136 and 158RLAVY162) influence the structural features of apoE over time, since a noticeable difference was found between the starting conformation (Figure 4, 0 ns frames) and the 300 ns conformation (Figure 4, 300 ns frames). The N-terminal domain kept its bundle structure throughout the simulation, and only a slight conformational tilt was observed in the 3D shape of the molecule (Figure 4, 300 ns frames). Conversely, the C-terminal domain was characterized by large fluctuations (8–10 Å) with respect to the N-terminal domain, possibly due to its higher solvent exposure (Figure S2a, blue curve). Root mean square fluctuation calculations reveal a deviation of approximately 10 Å between residues Glu270 and His299, corresponding to the C-terminal domain (Figure S3). This result is common for apoE, since similar conformational changes allow the four-helix bundle to emerge during lipid binding [33]. It is also believed that C-terminal fluctuations allow new interactions or α-helix to β-sheet conversion, due to partial destabilization of apoE, subsequently resulting in self-assembly. In either case, conformational instability of the C-terminal domain exposes the aggregation-prone segments 132ELRVR136 and 158RLAVY162, otherwise hidden in the core of apoE. The overall conformational variations of segments 132ELRVR136 and 158RLAVY162 can be visually inspected in Figure S2. 158RLAVY162 exhibited higher conformational mobility, meaning that this segment participated in transient C-terminal conformational changes (Figure S2b, orange triangles). The conformational unraveling of the most aggregation-prone part of apoE (according to AMYLPRED, Figure 1a) explains the intrinsic propensity of apoE to form amyloid-like fibrils [63]. Our aggregation assays, in combination with the computational MD results, suggest that the C-terminal domain protects the aggregation-prone part of apoE from misfolding by covering the aggregation-prone regions 132ELRVR136 and 158RLAVY162 located in the N-terminal domain. This finding is in agreement with the computational analysis by Das and Gursky [55].

**Figure 4.** Molecular dynamics simulations of an apoE NMR structure for 300 ns. The N-terminal domain is shown in green, the C-terminal domain is shown in blue, and the hinge domain is shown in red. "Aggregation-prone" hot-spots 132ELRVR136 and 158RLAVY162 are colored in orange. Structural movements uncover otherwise hidden apoE hot-spots (arrows). Models are shown at 0°, 90°, and 180° rotations.

#### *2.4. An apoE Aggregation Hot-Spot Anchors Oligomeric A*β

Numerous studies demonstrate that apoE is a component of peripheral deposits and senile plaques of AD patients [64–66]. In vitro experiments have shown that co-incubation of apoE3 and apoE4 with the Aβ peptide induces fibrillation of the peptide [67,68], supporting the idea that apoE specifically interacts with Aβ. One mechanism by which apoE might be involved in the pathology of AD is by modulating the activity of Aβ and binding to oligomeric Aβ species [69,70]. Molecular docking was employed to identify the Aβ and apoE epitopes in the Aβ–apoE complex. The Aβ aggregation profile was analyzed using AMYLPRED (Figure S4). Two aggregation-prone regions were predicted, comprising an N-terminal hexapeptide (KLVFFA) and a longer, thirteen-residue C-terminal peptide (GAIIGLMVGGVVI) (Figure S4). Previous experimental studies have shown that both regions have self-aggregation properties, and they have been suggested as crucial amyloidogenic determinants of Aβ [71].

Supervised molecular docking was performed to build the Aβ–apoE complex, using the 300 ns apoE conformation (described above) as the initial structure for the N-terminal apoE domain and the 2BEG NMR structure [72] as the 3D structure of Aβ oligomers. The apoE binding epitope was restricted to residues 130 to 165, based on reliable information from experimental and computational studies pinpointing this apoE domain as the major part of apoE interacting with Aβ [57]. The predicted Aβ aggregation-prone regions were used as computational restraints in HADDOCK. The resulting complex revealed an interaction between the C-terminal aggregation-prone epitope of Aβ and the amyloidogenic 132ELRVR136 peptide, located in the N-terminal apoE domain. This cluster was evaluated as having the best HADDOCK score, which corresponds to the smallest weighted HADDOCK sum (Figure S5, 0 ns).
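
The ranking step described above, in which the cluster with the smallest weighted sum is taken as the best model, can be sketched as follows. The weights are those commonly quoted for the final refinement stage of HADDOCK 2.x and are an assumption here, since the text does not list them; the buried surface area term mentioned in Section 3.7 is omitted for brevity. The cluster names and energy values are purely hypothetical and are not results from this study.

```python
from dataclasses import dataclass

# Assumed term weights (not stated in the article).
WEIGHTS = {"vdw": 1.0, "elec": 0.2, "desolv": 1.0, "air": 0.1}

@dataclass
class Cluster:
    name: str
    vdw: float     # van der Waals energy
    elec: float    # electrostatic energy
    desolv: float  # desolvation energy
    air: float     # restraint violation energy

def weighted_score(c: Cluster) -> float:
    """Weighted sum of energy terms; lower (more negative) is better."""
    return (WEIGHTS["vdw"] * c.vdw
            + WEIGHTS["elec"] * c.elec
            + WEIGHTS["desolv"] * c.desolv
            + WEIGHTS["air"] * c.air)

# Hypothetical clusters for illustration only.
clusters = [
    Cluster("cluster_A", vdw=-40.0, elec=-200.0, desolv=10.0, air=50.0),
    Cluster("cluster_B", vdw=-30.0, elec=-100.0, desolv=5.0, air=120.0),
]

ranked = sorted(clusters, key=weighted_score)
best = ranked[0]
```

Sorting by the score rather than taking only the minimum keeps the full ranking available, which is useful when several clusters score within each other's standard deviation.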

Having identified the most favorable Aβ–apoE complex, the next step was to evaluate its dynamics and stability. After 100 ns of all-atom MD simulation, a complex dissociation was observed and the structure of the Aβ oligomer changed. Similar results were observed after 200 and 300 ns of simulation time (Figure S5). Despite the secondary structure alterations, the interaction interface between Aβ and apoE was retained over time. This retrospectively supports the spatial arrangement that emerged from the molecular docking model (Figure S5, 0 ns). The apoE N-terminal domain was the most stable entity, since this domain displayed similar dynamic behavior over time. This behavior is consistent with the simulation results observed for the representative NMR conformer of full-length apoE3 presented above (see Section 2.3). Apart from significant changes in Aβ orientation and stability, the Aβ C-terminal epitope remained constantly attached to the amyloidogenic 132ELRVR136 peptide over the nanosecond timescale. The instability of the Aβ oligomer was to be expected, since oligomeric states are significantly less stable than the amyloid state of proteins [73]. Given the competitive relationship between lipid-free apoE molecules tending to self-assemble and Aβ oligomers "willing" to interact with apoE monomers [74,75], we hypothesize that these computational results give new insights into the delicate Aβ–apoE interconnection. This computational outcome provides some details of the intermolecular and intramolecular interactions associated with the formation of homomeric or heteromeric supramolecular assemblies, which may be the key to targeting protein misfolding diseases.

#### **3. Materials and Methods**

#### *3.1. Identification of Aggregation-Prone Peptides in apoE and A*β

The human apoE sequence (UniProtKB: P02649/APOE\_HUMAN), corresponding to the APOE3 allele, and the human amyloid-beta precursor protein (APP) fragment 672–713 (UniProtKB: P05067/A4\_HUMAN), namely Aβ1–42, were analyzed with AMYLPRED [37] to identify fibril-forming aggregation hot-spots. The fibril-forming segments chosen for this study were predicted by at least two predictors (default AMYLPRED threshold). Figures S1 and S4 illustrate the consensus AMYLPRED prediction for all apoE isoforms and Aβ, respectively.

#### *3.2. Peptide Design, Synthesis, and Preparation of Peptide Samples*

Based on the amyloidogenic profile of apoE (Figure S1), two short pentapeptide–analogues were designed. Since, according to previous studies, sequence stretches in proteins should comprise a minimum of five consecutive residues, AMYLPRED predictions were extended from both ends. The pentapeptide–analogues 132ELRVR136 and 158RLAVY162, corresponding to the 4th helix of the four-helix bundle of apoE (Figure 1, orange), were chemically synthesized at high purity (>98%) by GeneCust© Europe, Luxembourg. The peptide–analogues have free N- and C-termini. Lyophilized aliquots of both pentapeptides were re-suspended in distilled water (pH 5.5) at concentrations of up to 15 mg mL<sup>−1</sup> and incubated at ambient temperature for 1–2 weeks. Both pentapeptides were found to produce fibril-containing gels.

#### *3.3. X-ray Diffraction*

For each peptide–analogue, a droplet (~10 μL) of mature fibril suspension was placed between two quartz capillaries covered with wax. The capillaries were spaced ~1.5 mm apart and mounted horizontally on a glass substrate, as collinearly as possible, in order to obtain an oriented fiber. The X-ray diffraction pattern from this fiber was collected at the P14 synchrotron beamline (Petra III, EMBL-Hamburg, Germany), operated at a wavelength of 1.23953 Å, with a PILATUS 6M detector. The specimen-to-detector distance was set at 225.11 mm and the exposure time was set to 1 s. The X-ray patterns were initially viewed using the program CrysAlisPro [76,77] and subsequently displayed and measured with the aid of the iMosFLM program [78].
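
To relate the stated wavelength and specimen-to-detector distance to the spacings reported in Section 2.2, Bragg's law can be applied to the radial position of a reflection on the detector. The sketch below assumes a flat detector perpendicular to the beam with the direct beam at its center; this geometry and the function names are illustrative assumptions, not a description of the software actually used for the measurements.

```python
import math

WAVELENGTH_A = 1.23953   # X-ray wavelength in Å (from the text)
DISTANCE_MM = 225.11     # specimen-to-detector distance in mm (from the text)

def d_spacing(radius_mm, wavelength=WAVELENGTH_A, distance=DISTANCE_MM):
    """Bragg spacing (Å) for a reflection at a given radius from the beam center."""
    two_theta = math.atan2(radius_mm, distance)          # scattering angle
    return wavelength / (2.0 * math.sin(two_theta / 2.0))

def radius_for_d(d, wavelength=WAVELENGTH_A, distance=DISTANCE_MM):
    """Inverse relation: detector radius (mm) expected for a spacing d (Å)."""
    theta = math.asin(wavelength / (2.0 * d))
    return distance * math.tan(2.0 * theta)

# Spacings reported for the 132ELRVR136 fibers.
for d in (4.6, 11.7, 23.7):
    print(f"d = {d:5.1f} Å -> radius = {radius_for_d(d):6.2f} mm")
```

Because the radius scales roughly inversely with the spacing, the 4.6 Å inter-strand reflection lies far from the beam center, while the 11.7 Å and 23.7 Å inter-sheet reflections lie close to it.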

#### *3.4. Negative Staining and Transmission Electron Microscopy*

For negative staining, droplets (3–5 μL) of the mature fibril suspensions were applied to glow-discharged 400-mesh carbon-coated copper grids for 2 min. The grids were stained with a droplet (5 μL) of 2% (*w*/*v*) aqueous uranyl acetate for 60 s and the excess staining was removed by blotting with a filter paper. The fibril-containing grids were initially air-dried and subsequently examined with a Morgagni™ 268 transmission electron microscope, operated at 80 kV. Digital acquisitions were performed with an 11-Mpixel side-mounted Morada CCD camera (Soft Imaging System, Muenster, Germany).

#### *3.5. Attenuated Total Reflectance Fourier-Transform Infrared Spectroscopy (ATR FTIR) and Post-Run Computations of the Spectra*

A 10-μL droplet of each apoE peptide mature fibril suspension was cast on flat stainless-steel plates coated with an ultrathin hydrophobic layer (SpectRIM, Tienta Sciences, Inc., Indianapolis, IN, USA) and left to dry slowly at ambient conditions in order to form thin hydrated films. Infrared spectra were obtained from these films at a resolution of 4 cm<sup>−1</sup>, using an IR microscope (IRScope II by Bruker Optics) equipped with a Ge attenuated total reflectance (ATR) objective lens (20×) and attached to a Fourier-transform infrared (FTIR) spectrometer (Equinox 55, by Bruker Optics). Ten 32-scan spectra were collected from each sample and averaged to improve the signal-to-noise (S/N) ratio. Spectra are shown in absorption mode after correction for the wavelength dependence of the penetration depth (pd~λ). Absorption band maxima were determined from the minima in the second derivative of the corresponding spectra. Derivatives were computed analytically using routines of the Bruker OPUS/OS2 software, including smoothing over a ±13 cm<sup>−1</sup> range around each data point, performed by the Savitzky–Golay algorithm [79]. Smoothing over narrower ranges resulted in a deterioration of the S/N ratio and did not increase the number of minima that could be determined with confidence.
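
The band-picking procedure described above (minima of a Savitzky–Golay-smoothed second derivative) can be sketched in plain Python. This is not the Bruker OPUS routine: it is a hand-written quadratic Savitzky–Golay second derivative applied to a synthetic spectrum made of two Gaussian bands. The 1 cm<sup>−1</sup> grid spacing, the band widths and amplitudes, and the relative threshold used to reject noise are all assumptions made for illustration.

```python
import math

def sg_second_derivative(y, m, h=1.0):
    """Second derivative from a local quadratic least-squares fit over 2m+1 points."""
    n_win = 2 * m + 1
    offsets = range(-m, m + 1)
    s2 = sum(i * i for i in offsets)
    s4 = sum(i ** 4 for i in offsets)
    denom = n_win * s4 - s2 * s2
    # Convolution weights for 2*a2, a2 being the fitted quadratic coefficient.
    weights = [2.0 * (n_win * i * i - s2) / denom for i in offsets]
    d2 = [None] * len(y)                      # edges are left undefined
    for c in range(m, len(y) - m):
        d2[c] = sum(w * y[c + k] for k, w in zip(offsets, weights)) / (h * h)
    return d2

def band_positions(x, d2, rel_threshold=0.1):
    """Positions of local minima deeper than rel_threshold times the deepest minimum."""
    cutoff = rel_threshold * min(v for v in d2 if v is not None)
    found = []
    for i in range(1, len(x) - 1):
        if None in (d2[i - 1], d2[i], d2[i + 1]):
            continue
        if d2[i] < d2[i - 1] and d2[i] < d2[i + 1] and d2[i] < cutoff:
            found.append(x[i])
    return found

def gaussian(v, center, sigma, amplitude):
    return amplitude * math.exp(-((v - center) ** 2) / (2.0 * sigma ** 2))

# Synthetic amide I region on an assumed 1 cm^-1 grid (not measured data).
wavenumbers = list(range(1580, 1741))
absorbance = [gaussian(v, 1627, 8.0, 1.0) + gaussian(v, 1695, 8.0, 0.3)
              for v in wavenumbers]

# m = 13 points on a 1 cm^-1 grid corresponds to the ±13 cm^-1 window in the text.
bands = band_positions(wavenumbers, sg_second_derivative(absorbance, m=13))
```

On this synthetic input the two minima fall at the centers of the two Gaussian bands, which mirrors how the amide I component near 1627 cm<sup>−1</sup> and the weaker high-frequency component near 1695 cm<sup>−1</sup> would be separated in a real spectrum.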

#### *3.6. Congo Red Staining and Polarized Light Microscopy*

Fibril suspensions of the peptide solutions were applied to glass slides and stained with a 10 mM Congo red (Sigma) solution in distilled water (pH 5.5) for ~30 min. Excess stain was removed by several washes with distilled water, and the samples were left to dry for approximately 10 min. The samples were observed under bright-field illumination and between crossed polars, using a Leica MZ75 polarizing stereomicroscope equipped with a JVC GC-X3E camera.

#### *3.7. Molecular Docking and Molecular Dynamics Simulations*

To derive a structural model of the Aβ–apoE complex, version 2.2 of the HADDOCK web server was used [80]. The HADDOCK score was used to rank and evaluate the generated clusters. The score is a weighted sum of various energy terms and the buried surface area between the molecules constituting the complex. A number of molecular dynamics (MD) simulations were then designed and performed for apoE, both in its monomeric form and in its complex with Aβ protofibrils. Each protein system was inserted into a cubic solvent box, with a minimum distance of at least 1.5 nm between the box boundaries and the protein coordinates. The solvent was modeled using the TIP3P water model [81], and the systems were ionized using NaCl counter-ions to neutralize unwanted charges and to reach an ambient NaCl concentration of 0.15 M, mimicking neutral pH conditions. Each simulation system was subjected to thorough energy minimization, followed by two stages of equilibration simulations with position restraints applied to the protein coordinates, namely, a 500 ps simulation in the canonical (NVT) ensemble to equilibrate temperature and a 1 ns simulation in the isothermal-isobaric (NPT) ensemble to equilibrate pressure. An additional 1 ns equilibration simulation was also performed without any restraints. Finally, production simulations were performed in the NPT ensemble for 300 ns.

All simulations were performed using GROMACS v. 2016.3 [82] and the AMBER 99SB-ILDN force field [83]. The LINCS algorithm [84] was applied to model bond constraints, enabling the use of a 2 fs time-step. Short-range non-bonded interactions were modeled using a twin-range cutoff at 0.8 nm, while long-range electrostatic interactions were modeled using the Particle Mesh Ewald (PME) method [85], with a Fourier grid spacing of 0.12 nm and cubic interpolation (PME order 4). Temperature was maintained at 300 K with separate couplings for the proteins and solvent, using the Berendsen weak coupling algorithm [86] during equilibration and the Nosé–Hoover thermostat [87,88] in the production simulations, with a coupling constant of τ<sub>T</sub> = 0.1 ps. Pressure was isotropically controlled at 1.013 bar (1 atm), using the Berendsen weak coupling algorithm [86] during equilibration and the Parrinello–Rahman barostat [89] in the production simulations, with a coupling constant of τ<sub>P</sub> = 2.0 ps and a compressibility of 4.5 × 10<sup>−5</sup> bar<sup>−1</sup>. Simulation results were analyzed using various GROMACS utilities and Visual Molecular Dynamics (VMD) v. 1.9.4 [90]. Pictures were prepared with PyMOL [91].
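
As a quick consistency check on the protocol, the number of integration steps implied by the 2 fs time-step can be computed for each stage. The stage durations are taken from the text; the dictionary layout and names are illustrative assumptions and do not correspond to actual GROMACS input files.

```python
TIME_STEP_FS = 2          # integration time-step from the text
FS_PER_PS = 1_000
FS_PER_NS = 1_000_000

# Stage durations in femtoseconds, as described in Section 3.7.
stage_duration_fs = {
    "nvt_equilibration": 500 * FS_PER_PS,        # 500 ps, position restraints
    "npt_equilibration": 1 * FS_PER_NS,          # 1 ns, position restraints
    "unrestrained_equilibration": 1 * FS_PER_NS, # 1 ns, no restraints
    "production": 300 * FS_PER_NS,               # 300 ns
}

def n_steps(duration_fs, dt_fs=TIME_STEP_FS):
    """Number of integration steps needed to cover duration_fs."""
    return duration_fs // dt_fs

stage_steps = {name: n_steps(t) for name, t in stage_duration_fs.items()}

for name, steps in stage_steps.items():
    print(f"{name:28s} {steps:>12,d} steps")
```

The production run alone therefore amounts to 150 million steps per system, which is why constraint algorithms that permit a 2 fs step matter in practice.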

#### **4. Conclusions**

The purpose of this study was to investigate the poorly explored amyloidogenic properties of human apolipoprotein E [58,63], a protein closely associated with disorders of worldwide prevalence, such as Alzheimer's disease [23,24]. Wild-type or mutated apolipoproteins, evolutionarily related to apoE, have been found in depositions of amyloid fibrils in vivo in several amyloidoses [92–101]. More specifically, apoCII and apoCIII have been reported to form amyloid fibrils both in vitro [102,103] and, recently, in vivo, causing rare forms of hereditary systemic amyloidosis [104,105]. These two newly identified fibril proteins expand the list of amyloidogenic apolipoproteins associated with amyloidoses [38] and draw attention to unknown aggregation properties of the apolipoprotein family.

In this work, AMYLPRED was used to probe hidden amyloidogenic motifs on the apoE polypeptide chain, whereas several biophysical and computational techniques were applied to characterize their properties. The results of our experimental work show that the predicted aggregation-prone apoE peptides self-assemble into amyloid-like fibrillar structures, displaying the main structural and tinctorial features of amyloids [106,107]. Computational tests evaluated the contribution of these peptides to the stability of apoE and explored their affinity for oligomeric Aβ. Molecular dynamics simulations revealed that both predicted apoE peptide–analogues undergo a critical structural transition that, under the right in vitro conditions, may result in apoE instability. Importantly, the amyloidogenic 132ELRVR136 peptide, a segment commonly predicted for all three apoE isoforms, emerged as the most favorable apoE epitope "attracting" the C-terminal epitope of oligomeric Aβ. Overall, the self-aggregation properties of apoE peptides described here add considerable detail, as they suggest a mechanistic explanation of apoE misfolding and involvement with oligomeric Aβ. As an extension to these conclusions, the concept of interacting amyloidogenic regions from separate partners found in amyloidoses offers hope of new anti-aggregation treatment directions.

**Supplementary Materials:** Supplementary materials can be found at http://www.mdpi.com/1422-0067/20/9/2274/s1.

**Author Contributions:** Conceptualization, P.L.T., V.A.I.; Methodology, P.L.T., F.A.B., N.N.L., V.A.I.; Validation, P.L.T., A.D.K., F.A.B., N.N.L.; Formal analysis, P.L.T., A.D.K.; Investigation, P.L.T., A.D.K., V.A.I.; Resources, V.A.I.; Writing, original draft preparation, P.L.T., A.D.K.; Writing, review and editing, P.L.T., V.A.I.; Visualization, P.L.T., V.A.I.; Supervision, V.A.I.; Funding acquisition, V.A.I.

**Funding:** The present work was co-funded by the European Union and Greek national funds through the Operational Program "Competitiveness, Entrepreneurship and Innovation", under the call "RESEARCH-CREATE-INNOVATE" (project code: T1EDK-00353).

**Acknowledgments:** We thank the Institute of Biology, Medicinal Chemistry and Biotechnology at the National Hellenic Research Foundation for access to the X-ray diffraction facility. We acknowledge the help of Evangelia Chrysina with the X-ray diffraction experiments. The help of George Baltatzis and Efstratios Patsouris and the use of the Morgagni Microscope at the 1st Department of Pathology, Medical School, the National and Kapodistrian University of Athens is also gratefully acknowledged. We should also like to sincerely thank the two handling editors and the reviewers of this manuscript for their very useful and constructive criticism.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **Abbreviations**


#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **A Clinical Approach for the Use of VIP Axis in Inflammatory and Autoimmune Diseases**

#### **Carmen Martínez 1,\*, Yasmina Juarranz 1, Irene Gutiérrez-Cañas 1, Mar Carrión 1, Selene Pérez-García 1, Raúl Villanueva-Romero 1, David Castro 1, Amalia Lamana 1, Mario Mellado 2, Isidoro González-Álvaro <sup>3</sup> and Rosa P. Gomariz <sup>1</sup>**


Received: 30 November 2019; Accepted: 18 December 2019; Published: 20 December 2019

**Abstract:** The neuroendocrine and immune systems are coordinated to maintain the homeostasis of the organism, generating bidirectional communication through shared mediators and receptors. Vasoactive intestinal peptide (VIP) is the paradigm of an endogenous neuropeptide produced by neurons and endocrine and immune cells, involved in the control of both innate and adaptive immune responses. Exogenous administration of VIP exerts therapeutic effects in models of autoimmune/inflammatory diseases mediated by G-protein-coupled receptors (VPAC1 and VPAC2). Currently, there are no curative therapies for inflammatory and autoimmune diseases, and patients present complex diagnostic, therapeutic, and prognostic problems in daily clinical practice due to their heterogeneous nature. This review focuses on the biology of VIP and VIP receptor signaling, as well as its protective effects as an immunomodulatory factor. Recent progress in improving the stability, selectivity, and effectiveness of VIP/receptors analogues and new routes of administration are highlighted, as well as important advances in their use as biomarkers, contributing to their potential application in precision medicine. On the 50th anniversary of VIP's discovery, this review presents a spectrum of potential clinical benefits applied to inflammatory and autoimmune diseases.

**Keywords:** vasoactive intestinal peptide; VPAC1 receptor; VPAC2 receptor; rheumatic diseases; inflammatory bowel disease; central nervous system diseases; type 1 diabetes; Sjögren's syndrome; biomarkers

#### **1. Introduction**

The nervous, endocrine, and immune systems are coordinated to maintain the homeostasis of the organism, generating bidirectional communication through shared mediators and receptors [1,2].

Vasoactive intestinal peptide (VIP) is the paradigm of an endogenous neuropeptide produced during autoimmune responses and processes of systemic and local inflammation. It acts as an immunomodulatory agent to restore homeostasis of the immune system [3]. Synthesized by neurons and endocrine and immune cells, VIP is involved in the control of both innate and adaptive immune responses [4–6]. Exogenous administration of VIP exerts therapeutic effects in models of autoimmune/inflammatory diseases mediated by two G-protein-coupled receptors (VPAC1, VPAC2) [7–12].

Inflammatory and autoimmune diseases include a clinically heterogeneous group of chronic diseases sharing inflammatory mechanisms, as well as a deregulation of the immune system [13,14]. These diseases can affect any organ or system and are often multiorganic. Among these pathologies, we find rheumatic diseases such as rheumatoid arthritis (RA), inflammatory bowel diseases (IBD), and multiple sclerosis (MS).

According to the Autoimmune Diseases Coordinating Committee of the National Institutes of Health in the United States, the prevalence of autoimmune pathologies is estimated at up to 8% of the population. These pathologies are characterized by a complex etiology combining different genetic, epigenetic, and environmental factors, such as tobacco use or history of infections, which result in the alteration of the regulation of the immune system [14–17]. These diseases lead to substantial levels of morbidity, a significant reduction in the quality of life, and premature death [18–21].

Currently, there are no curative therapies for inflammatory and autoimmune diseases, and patients present complex diagnostic, therapeutic, and prognostic problems in daily clinical practice due to their heterogeneous nature. Many of these challenges could be alleviated with appropriate biomarkers, allowing a more efficient use of current therapies, as well as the development of precision medicine.

This review focuses on the biology of VIP and VIP receptor signaling, as well as its protective effects as an immunomodulatory factor. Here, we consider their role in the pathogenesis of autoimmune diseases and inflammatory disorders and address the potential clinical application of the VIP/receptor axis.

#### **2. Biological Characteristics of VIP**

#### *2.1. VIP Discovery, Cellular Location, and Structure*

Sami Said described, for the first time in 1969, the existence of a peptide vasoactive agent with systemic vasodilator capacity present in the lungs of mammals. In collaboration with Viktor Mutt, Said purified this peptide from pig lungs, but only partially. Challenges in isolating it from the lungs led them to examine the intestine, since both tissues have a common embryonic origin. Thus, using porcine duodenal tissue, they isolated this vasodilator peptide and presented it to the scientific community, calling it the vasoactive intestinal peptide [22].

A few years later, the presence of this peptide was demonstrated in different areas of the central and peripheral nervous system, in neuronal cell bodies, axons, and dendrites [23], as well as in presynaptic endings [24], resulting in the categorization of VIP as a neuropeptide with neuromodulatory and neurotransmitter functions. This role was confirmed with the characterization of VIP receptors in numerous areas of the central nervous system (CNS) [25].

In the immune system, the first information dates back to 1985, when Felten et al. described VIP-like immunoreactivity in the thymus nerve endings [26]. Since then, VIP-ergic innervation in the spleen, lymph nodes, and mucosal-associated immune system has been demonstrated [27]. It is also important to note that sympathetic nervous system fibers innervate the joints, which explains the role of VIP in rheumatic diseases.

Regarding the cellular source involved in VIP production, the first evidence was reported for cells of myeloid lineage. Expression in mast cells was demonstrated by radioimmunoassay and immunohistochemistry in the rat peritoneum, intestine, and lung [28]. In 1980, O'Dorisio described the presence of VIP in human peripheral blood polymorphonuclear cells, especially neutrophils, but not in mononuclear cells [29]. VIP expression has also been described in human eosinophils [30] and in eosinophils of granulomatous lesions induced by infection with *Schistosoma mansoni* [31]. We reported that neither M1 nor M2 human macrophages express transcripts of VIP [32]. Concerning cells of lymphoid lineage, in the 1990s, our team reported, for the first time, the synthesis and secretion of VIP in murine T and B lymphocytes [33–35]. Since then, information on the important role VIP plays in inflammation and autoimmunity has continued to accumulate. Today, VIP is recognized as an important player in the circuit formed by the nervous, endocrine, and immune systems. It is also present in rheumatic diseases [36] and is one of the most studied peptides in terms of its physiological role in health and disease, especially in the immune system.

In the microenvironment of the different pathologies in which its effect has been studied, VIP originates from nerve endings and cells. In this sense, nerve fibers of the sympathetic nervous system in the joints have been reported in rheumatic diseases. Moreover, a decrease in the number of these nerve endings has been described in osteoarthritis (OA) and RA [37]. Regarding cellular origin, synovial fibroblasts (SF) from OA and RA patients have been found to express and release VIP [38].

VIP belongs to a broad family of neuropeptides and hormones, related both structurally and at the sequence level, called the secretin/VIP family. In addition to VIP and secretin, this family includes the pituitary adenylate cyclase-activating polypeptide (PACAP) 27 and PACAP38, helodermin, peptide histidine-methionine (PHM, in humans) or peptide histidine-isoleucine (PHI, in other mammals), the growth hormone-releasing factor (GRF), glucagon and its related peptides GLP1 and GLP2, and the gastric inhibitory peptide (GIP) [39]. The structural homology observed among the different members of this family is very high, with the following characteristics being common [40]: (I) a precursor peptide formed by a signal peptide, 1 to 3 bioactive peptides, and N- and C-terminal peptides; (II) a mature peptide of between 25 and 50 aa residues in length; (III) synthesis and release by nerve, immune, and/or endocrine cells; (IV) a marked tendency to form α-helix structures; and (V) the presence of a structural motif called N-Cap in the amino-terminal region. The helical structure seems to be a key element in the interaction with receptors and in signaling, and is considered an interesting therapeutic target [41]. These peptides show strong homology in their amino acid sequences on an evolutionary scale, suggesting a common origin from an ancestral gene [42]. VIP is a peptide with a molecular weight of 3326 Da, a basic nature, and an amphipathic character. Its primary structure consists of a single chain of 28 aa whose sequence has been highly conserved throughout evolution [43]. Although the presence of all of these aa is necessary for VIP to perform its biological functions, it has been shown that certain residues are crucial for this performance (His1, Val5, Arg14, Lys15, Lys21, Leu23, and Ile26). The secondary structure has a random coil in the N-terminal region and an α-helix structure in the C-terminal region. This structure is similar to that of other family members, especially that of PACAP27, with which it shares 68% sequence homology [44].
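
The molecular weight quoted above can be checked with a short calculation from average residue masses. The 28-residue sequence used below is the commonly reported human VIP sequence and is an assumption here, since the text does not spell it out; C-terminal amidation is likewise assumed, and the residue masses are standard average values.

```python
# Average residue masses in Da (standard values), limited to residues in VIP.
RESIDUE_MASS = {
    "A": 71.0788, "D": 115.0886, "F": 147.1766, "H": 137.1411,
    "I": 113.1594, "K": 128.1741, "L": 113.1594, "M": 131.1926,
    "N": 114.1038, "Q": 128.1307, "R": 156.1875, "S": 87.0782,
    "T": 101.1051, "V": 99.1326, "Y": 163.1760,
}
WATER = 18.0153            # added once for the free N- and C-termini
AMIDATION_SHIFT = -0.9847  # C-terminal -OH replaced by -NH2

# Assumed sequence (not given in the text).
VIP_SEQUENCE = "HSDAVFTDNYTRLRKQMAVKKYLNSILN"

def average_mass(sequence, amidated=True):
    """Average molecular mass (Da) of a linear peptide."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return mass + AMIDATION_SHIFT if amidated else mass

vip_mass = average_mass(VIP_SEQUENCE)

# Residues highlighted in the text as crucial, looked up by 1-based position.
key_residues = {pos: VIP_SEQUENCE[pos - 1] for pos in (1, 5, 14, 15, 21, 23, 26)}

print(f"{len(VIP_SEQUENCE)} residues, average mass ~ {vip_mass:.1f} Da")
print(key_residues)
```

Under these assumptions the computed mass falls within a few daltons of the quoted value, and the residues at positions 1, 5, 14, 15, 21, 23, and 26 correspond to His, Val, Arg, Lys, Lys, Leu, and Ile, in line with the list given in the text.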

#### *2.2. General Biological Functions*

The expression of VIP in the nervous system results in its release from nerve fibers in multiple organs. Thus, VIP is present in the innervation of the heart, kidney, lung, thyroid gland, and gastrointestinal and urogenital tracts. As described above, central and peripheral lymphoid organs, such as the thymus, spleen, and lymph nodes, are also innervated by VIP sympathetic nerve fibers. Moreover, VIP expression in cells of myeloid and lymphoid origin also contributes to its broad distribution, correlating with its functional pleiotropism. Thus, VIP acts as a neurotransmitter, immunoregulator, vasodilator, and stimulator of hormone secretion or secretagogue [6]. VIP contributes to a wide variety of physiological activities related to development, growth, immune response, circadian rhythms, endocrine control, and functions of the digestive, respiratory, cardiovascular, and reproductive systems [39]. Some of VIP's multiple biological activities include increased cardiac output, bronchodilation, smooth muscle relaxation, and regulation of secretion processes and motility in the gastrointestinal tract. In addition, as a secretagogue, VIP promotes the release of prolactin, luteinizing hormone, and growth hormone by the pituitary gland and regulates the release of insulin and glucagon in the pancreas. This peptide also promotes analgesia, hyperthermia, learning, and behavior; it has neurotrophic effects and regulates bone metabolism and embryonic development [45].

#### **3. VIP Receptors, Ligands, and Signaling Pathways**

#### *3.1. VIP Receptors*

VIP and PACAP were discovered in the 1970s and 1980s, respectively, and cloned in the 1980s and 1990s. The existence of several "VIP receptors" was inferred by pharmacological studies, such as cyclic AMP (cAMP) or radioligand binding assays, long before the actual receptor cloning. In this way, "VIP receptors" were described in normal and tumor cells and tissues [40]. These receptors showed pharmacological differences and were not correctly identified until the description of several ligands, agonists, and antagonists, and the cloning of the receptors.

VIP receptors belong to class B of the G-protein-coupled receptors (GPCRs), seven-transmembrane receptors that represent the most extensive family of signaling proteins. Ligands for class B GPCRs are peptides that bind to the large N-terminal part of the GPCR [46]. There are three receptors recognized by VIP: VPAC1, VPAC2, and PAC1 receptors. VIP binds to both VPAC receptors with equal affinity and with much lower affinity to PAC1, which is the PACAP-preferred receptor. The structures of several class B receptors have been determined, as well as crystal structures of peptide-bound receptors, which help to understand how the peptides bind to their receptors and how these receptors undergo conformational changes to allow downstream signaling [47].

#### 3.1.1. VPAC1 Receptor

The rat VPAC1 receptor was cloned in 1991 from a cDNA library, and the human VPAC1 was cloned in 1993 from the HT-29 cell line [48,49]. Only one variant of this receptor has been described thus far, which is expressed in several normal and malignant cells. It has a deletion that results in a receptor with five transmembrane domains lacking the G-protein binding domain. Even so, it can activate protein tyrosine kinase activity, but in a different way than the seven transmembrane domain receptor [50].

The affinity of several peptides for this receptor is as follows: VIP = PACAP > GRF > secretin [40]. In the last decade, advances in the study of molecule structures have allowed the dissection of the physical sites of interaction between VPAC1 and VIP, observing that the side chains of several residues in the VIP sequence are in contact with several others in the receptor sequence, although the whole interaction between the two molecules has yet to be elucidated. Nevertheless, all the models available are in concordance with the mechanism proposed for the ligand–receptor interaction for this family of receptors, the "two domain" model, in which part of the peptide remains inside the N-terminal ectodomain of the receptor, while the N-terminus of the peptide is able to interact with the transmembrane region of the receptor [39].

#### 3.1.2. VPAC2 Receptor

This receptor was also first cloned from rats, with the human and mouse receptors cloned shortly thereafter [51–53]. The first variant of this receptor, described in mouse tissues, showed a deletion in exon 12, which corresponds to the carboxyl-terminal end of the seventh transmembrane domain. This variant lacks its normal function of increasing cAMP [54]. The second variant, found in a human malignant T-cell line, presented a deletion in exon 11 and had lower affinity for VIP [55]. The most recently described variant is the same as that described for the VPAC1 receptor [50]. The order of affinity for human VPAC2 expressed in different cell lines is VIP = PACAP = helodermin > secretin [40].

#### *3.2. Ligands*

During the last four decades, many ligands have been developed for VIP receptors, both agonists and antagonists. Most of them were created by modifying endogenous peptides and displayed different affinities and selectivities, with the first descriptions unable to differentiate between the two receptors [56–58]. Selective agonists for the VPAC1 receptor have been generated, such as [K15, R16, L27]VIP(1-7)/GRF(8-27) [59], [Ala11,22,28]VIP [60], [L22]VIP [61], [R16]PACAP(1-23) [62], and LBT-3393 [63]. A selective antagonist is also available: PG97-269 [64]. Regarding VPAC2, cyclic peptides have been demonstrated to be selective agonists, such as Ro25-1553 [65] and Ro25-1392 [66], as have some other peptides, such as BAY 55-9837 [67], Hexanoyl [A19,K27,28]VIP, rRBAYL [68], and LBT-3627 [63]. Only two VPAC2 selective antagonists have been described thus far: PG99-465 [69] and VIpep-3 [70]. Some of these approaches aim to identify more metabolically stable peptides [63] in order to improve their in vivo administration. Recently, one in silico study predicted possible structures defining affinities for these receptors, constructing classifiers to predict the bioactivities of novel VPAC ligands and highlighting the importance of electrostatic properties in the interaction of VIP derivatives with the receptors [71]. Moreover, other approaches provide tools for the study of these receptors or for their therapeutic use, such as nanoparticles that enhance the half-life of one VPAC2 selective agonist [72] and specific nanobodies for the VPAC1 receptor that bind at a different site than VIP and thus do not interfere with the coupling of the peptide [73].

#### *3.3. Signaling Pathways*

The main signaling pathway of VPAC receptors is their coupling to G-proteins, which are heterotrimeric proteins composed of three subunits: α, β, and γ. When stimulated, the α subunit binds to GTP and dissociates from the βγ dimer. Activated Gα moves through the membrane to its effector, the enzyme adenylate cyclase (AC), which in turn catalyzes cAMP synthesis [40]. This second messenger classically activates protein kinase A (PKA), which phosphorylates and can activate or inactivate different signaling pathways, depending on the cell type. For instance, the typical transcription factor activated by cAMP through PKA is the cAMP-response element binding protein (CREB). PKA can also activate mitogen-activated protein kinases (MAPK) from several subfamilies. Moreover, cAMP can, in a PKA-independent way, activate exchange proteins directly activated by cAMP (EPAC), which are guanine nucleotide exchange factors (GEF) for the small G-protein Rap [74]. The βγ dimer interacts with several proteins, such as Ras, which in turn activates extracellular regulated kinases (ERK). Subsequently, ERK interacts with and activates phosphoinositide 3-kinase (PI3K) [75,76].
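The cascade described in this paragraph can be read as a small directed graph. The sketch below is only an illustrative simplification: the node names and the `SIGNALING_EDGES` and `downstream` identifiers are hypothetical, the edges are transcribed from the paragraph above, and cell-type dependence, inhibition, and feedback are deliberately ignored.

```python
from collections import deque

# Activation edges transcribed from the paragraph above (a simplification;
# MAPK is kept as a generic node, separate from the Ras-driven ERK branch).
SIGNALING_EDGES = {
    "VPAC receptor": ["G-alpha", "G-beta-gamma"],
    "G-alpha": ["AC"],
    "AC": ["cAMP"],
    "cAMP": ["PKA", "EPAC"],
    "PKA": ["CREB", "MAPK"],
    "EPAC": ["Rap"],
    "G-beta-gamma": ["Ras"],
    "Ras": ["ERK"],
    "ERK": ["PI3K"],
}


def downstream(node, edges=SIGNALING_EDGES):
    """Return every component reachable from `node` (breadth-first search)."""
    seen = set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for target in edges.get(current, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Components downstream of the second messenger versus the beta-gamma dimer.
print(sorted(downstream("cAMP")))
print(sorted(downstream("G-beta-gamma")))
```

In this simplified view, the PKA-dependent (CREB, MAPK) and PKA-independent (EPAC, Rap) branches both hang off cAMP, whereas the Ras/ERK/PI3K branch is reached only through the βγ dimer.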

VPAC receptors can also mediate the increase in Ca<sup>2+</sup> through the activation of either Gi/o or Gq proteins. Although this pathway has shown lower potencies relative to cAMP, the rise in calcium is of physiological relevance [77]. Less frequently, they can also activate phospholipase C (PLC) through the activation of Gi/o by a mechanism that likely involves the βγ dimer and that subsequently increases the production of inositol trisphosphate (IP3). This rise in IP3 increases the [Ca<sup>2+</sup>], which, together with diacylglycerol (DAG), activates protein kinase C (PKC), which can also phosphorylate several other kinases [78]. Furthermore, phospholipase D (PLD) can be activated in a pathway involving the small G protein ARF. This phospholipase hydrolyzes phosphatidylcholine (PC), generating the signaling molecule phosphatidic acid (PA), which functions pleiotropically in several signaling pathways [79].

Receptor activity modifying proteins (RAMPs) are single-pass transmembrane proteins that do not bind any known ligand and need to couple with a receptor to reach the cell surface. VPAC1 and VPAC2 can bind the three known RAMPs without any change in their affinity for ligands. Both receptors enhance the cell-surface expression of the three RAMPs, and co-expression of VPAC1 and RAMP2 enhances the IP3 response without modifying cAMP signaling [80]. Furthermore, VPAC2 co-expression with RAMP1 increases basal cAMP, whereas basal cAMP is diminished when VPAC2 is co-expressed with RAMP3. Moreover, co-expression with RAMP1 and 2 enhances coupling to Gi/o/t/z, but does not modify the binding to Gs [81].

Regarding inflammatory pathways, three of the most important transcription factors activated in inflammatory processes are AP-1 (activator protein 1), NFκB (nuclear factor κB), and IRF (interferon regulatory factor). VIP is able to inhibit AP-1 and IRF activation through a PKA-dependent mechanism and can also prevent NFκB translocation to the nucleus by impeding IKK activation through a cAMP-independent mechanism [82–85].

The βγ dimer has been shown to bind GPCR kinases (GRKs), recruiting them to the membrane. When GPCRs are phosphorylated by GRKs, they bind arrestins, allowing receptor desensitization and internalization and/or signaling from the inside, mainly through activation of different MAPKs [46]. In particular, VPAC1 and VPAC2 exhibit augmented desensitization when co-transfected with GRKs 2, 3, 5, and 6 [86]. Nonetheless, this is not the only way in which GPCRs have effects within the cell. Recently, the presence of intracellular GPCRs has been described, which arrive there through many different processes [87]. VPAC1 has been found to be expressed in the nucleus of several tumor cells, such as human breast cancer [88], human renal carcinoma [89], and human glioblastoma, where there is also a weak expression of nuclear VPAC2 in one of the cell lines studied [90]. More recently, VPAC1 has been observed on the surface and nuclear membrane of T helper (Th) cells, whereas its expression is limited to the nucleus when these cells are activated [91]. There is no evidence regarding how VPAC2 arrives at the nucleus, and all the options described elsewhere are possible [87]. However, there is work showing that VPAC1 has a nuclear localization signal sequence in the C-terminal domain of the receptor, and in this way is able to exhibit nuclear expression [90]. It has been described that Cys37 in the VPAC1 receptor is essential for the translocation of the receptor to the nucleus and that it must be palmitoylated to be functional [92].

#### **4. A Very Important Peptide in Inflammation and Autoimmunity**

#### *4.1. Targeting Balance of Inflammatory Factors*

Inflammation is a complex homeostatic process mediated by factors of plasma and cellular origin whereby the effects of harmful stimuli are controlled in the tissues. When the inflammation persists over time, beyond what is necessary, and stops responding to the reparative process, it becomes destructive and chronic.

In chronic inflammation, there is a massive infiltration of cells involved in innate (monocytes–macrophages and dendritic cells) and adaptive immunity (CD4<sup>+</sup> T cells and B cells). A complex network of pro-inflammatory cytokines is established, and the cytokines are secreted primarily by activated macrophages and CD4<sup>+</sup> T cells at the site of inflammation.

The macrophage is the main producer of cytokines, and, when activated by different danger signals, it releases several pro-inflammatory products, such as interleukin (IL)-1, tumor necrosis factor α (TNF-α), IL-6, IL-12, and nitric oxide (NO), followed later by the secretion of anti-inflammatory cytokines such as IL-10 [93]. At the site of inflammation, numerous chemokines are also secreted, exacerbating the inflammatory process by attracting more leukocytes. Despite its beneficial effects in the defense of the organism, the sustained production of pro-inflammatory factors can lead to pathological conditions such as septic shock, respiratory distress syndrome, and autoimmune disease [94–96].

Numerous studies, in both animal and human models, show that VIP plays a key role in maintaining homeostasis by controlling the balance of pro- and anti-inflammatory cytokines: it inhibits the production of pro-inflammatory cytokines and chemokines such as TNF-α, IL-6, IL-12, CXCL8, and CCL2, as well as NO, and stimulates the expression of anti-inflammatory cytokines such as IL-10 [4–6].

In activated macrophages, VIP inhibits the production of TNF-α, IL-12, and NO primarily through VPAC1, expressed constitutively, and, to a lesser degree, through inducible VPAC2. VIP's binding to VPAC1 induces both a cAMP-dependent and a cAMP-independent pathway that regulate cytokine and NO production at the transcriptional level. VIP inhibits the expression of TNF-α, IL-12, and inducible nitric oxide synthase (iNOS) by reducing the binding of the NFκB transcription factor to the promoter, and increases IL-10 expression by increasing the binding of the CREB factor [97]. Thus, molecular mechanisms and transcription factors involved in VIP signaling during inflammatory responses include inhibition of interferon (IFN)-γ-induced Jak1/Jak2 phosphorylation and STAT1 activation, inhibition of different MAPK cascades, inhibition of IκB kinase, and stimulation of the CREB factor [3,97] (Figure 1).

**Figure 1.** VIP/VPAC receptors' axis signaling pathways. VIP/VPAC receptors' binding activates a cAMP-dependent signaling pathway mediated by the induction of AC (black arrows). Then, cAMP activates PKA, which in turn induces nuclear translocation of CREB (black arrows). In addition, PKA inhibits the activation of pro-inflammatory transcription factors such as AP-1, IRF, or NFκB (black cross). Additionally, cAMP stimulates EPAC in a PKA-independent way (black arrows). This second messenger induces anti-inflammatory transcription factors. VPAC receptors can also activate PLC and PI3K (black arrows). Both signaling pathways produce the nuclear translocation of anti-inflammatory transcription factors. These receptors can also interact with accessory RAMPs, modulating the canonical signaling pathways (dark green arrow). Furthermore, VPAC1 is able to translocate to the nucleus by the interaction with GRKs (blue arrows). Inflammatory stimuli activate signaling pathways (light green arrows).

VIP also modulates inflammatory responses through the regulation of different functions of other cells, including mast cells, microglia, dendritic cells, and synovial fibroblasts [98–101]. Moreover, in terms of adaptive immunity, VIP reduces pro-inflammatory Th1 and Th17 responses, as described below.

The importance of endogenous VIP in the regulation of inflammation and autoimmunity has been confirmed in knockout (KO) mouse models showing altered immune responses. At basal conditions, the immune phenotype of the mice studied so far is relatively mild. The role of VIP is mainly highlighted in challenging inflammatory conditions. Thus, VIP-deficient mice develop lung inflammation [102,103]. However, there are discrepancies about the resistance or susceptibility of VIP-KO mice to endotoxemia. Hamidi et al. described an increased susceptibility to death from endotoxemia, while Abad et al. found that VIP-KO mice exhibit resistance to endotoxic shock and decreased pro-inflammatory responses due, in part, to the presence of an intrinsic defect in the responsiveness of inflammatory cells in the chronic absence of VIP, suggesting that these mice may exhibit a defect in the innate arm of the immune system [104,105].

Given the accumulated evidence of VIP anti-inflammatory properties, VIP treatment has been reported to protect against septic shock and various inflammatory and autoimmune diseases, and to act as a survival factor against injury of lung and neuronal cells [45,106–110]. The role of VIP in the inflammatory component of the diseases described in this review is treated in detail in the different pathologies.

#### *4.2. Modulating the Expression of TLRs*

Toll-like receptors (TLRs) are a large family of type I transmembrane proteins belonging to the pattern-recognition receptors, which are specialized in the recognition of extracellular and endosomal pathogen-associated molecular patterns, serving as warning signals for the immune system. Likewise, TLRs are able to specifically bind damage-associated molecular patterns, which are associated with tissue damage, cell stress, and cell death [111–113]. Therefore, TLRs are defined as essential receptors for triggering the innate immune response and, subsequently, for regulating the adaptive immune response [114]. The expression of these receptors has been detected not only in immune cells, including macrophages, neutrophils, mast cells, dendritic cells, and T and B lymphocytes [115–117], but also in non-immune cells, such as synovial fibroblasts, keratinocytes, and pulmonary and intestinal epithelial cells [118–121]. In humans, these transmembrane receptors are found both on cell membranes (TLR1, 2, 4, 5, and 6) and in endosomes (TLR3, 7, 8, and 9) [111,122].

Upon ligand binding to TLRs, complex signal transduction cascades are triggered, requiring different adapter proteins. Myeloid differentiation primary response protein (MyD88) is involved in signaling by all TLRs, with the exception of TLR3 [123,124]. Both MyD88-dependent and -independent pathways lead to activation of NFκB, IRF3/7, and/or AP-1, which ultimately induce the production of inflammatory mediators, and co-stimulatory molecules [111,113,125]. Thus, an inappropriate or deregulated TLR activation, such as a persistent infection or a failure in their ability to discriminate self from non-self molecules can compromise immunological homeostasis [114,126]. In fact, numerous studies have demonstrated the involvement of TLRs in a wide variety of pathological processes, including both acute and chronic infections, as well as in the induction, progression, or exacerbation of many systemic autoimmune and/or inflammatory conditions [113,127–129]. In this regard, extensive data accumulated from animal models and in vitro human studies have strongly demonstrated the homeostatic effects of VIP on the deregulated expression and signaling of TLR in a context of inflammatory and/or autoimmune disease [130–133].

TLR modulation by VIP was described for the first time in the trinitrobenzene sulfonic acid (TNBS)-induced colitis mouse model, which mimics human Crohn's disease (CD). In that model, VIP reduces the upregulated expression of TLR2 and TLR4, as we describe in Section 5, "Protective Effects of VIP in Inflammatory/Autoimmune Diseases" [134,135]. The inhibitory effect of VIP on TLR2 expression was suggested to be due to its ability to prevent the nuclear translocation of NFκB, which has a binding site in the murine TLR2 gene [85,136]. Moreover, research on primary murine macrophages and the RAW 264.7 cell line showed that VIP exerts its suppressive effects on murine TLR4 expression at the transcriptional level by decreasing the binding of the transcription factor PU.1 via PI3K/Akt1 pathway [137]. In agreement with these findings, VIP was also able to reduce the lipopolysaccharide (LPS)-induced expression of TLR2 and TLR4 in human monocytic THP1 cells and peripheral blood monocytes, as well as to inhibit their differentiation to macrophages [138]. In these cells, VIP inhibited the nuclear translocation of PU.1, which acts as a transcriptional regulator of both TLR2 and TLR4 genes in humans [139,140].

The potent immunomodulatory effect of VIP on TLRs has also been reported in the mouse cornea after *Pseudomonas aeruginosa* infection. Data from in vitro studies demonstrated that VIP reduced LPS-stimulated expression of TLR1, TLR4, TLR6, TLR8, and TLR9 in macrophages and Langerhans cells [141].

The effects of VIP on TLRs were also assessed in SF from OA and RA patients. TLR2, TLR4, and TLR3 expression is described to be higher in RA-SF compared with OA-SF, whereas greater levels of TLR7 have been detected in OA-SF [82,142,143]. In vitro data indicated that VIP treatment decreases both LPS- and TNF-induced expression of TLR4 in RA-SF, whereas it has no effect on the elevated constitutive expression of TLR2 and TLR4 [142,144]. Furthermore, VIP also exerts a negative modulation of TLR4 signaling in these cells by the downregulation of important molecules of both the MyD88-dependent and -independent signaling pathways [83]. On the other hand, no effect of VIP on the expression of other TLRs has been detected in OA and RA-SF. However, VIP exerts an inhibitory activity on nuclear translocation of transcription factors activated by TLR3 and TLR7, with the subsequent reduction of antiviral, pro-inflammatory, and joint destruction mediators upregulated by engagement with these receptors [82,142].

All in all, VIP's ability to balance TLR expression and signaling may be of physiological relevance in the specific control of innate and adaptive immune responses.

#### *4.3. Regulating Th Cells*

Th cells are specialized cytokine-producing cells that modulate the adaptive immune response. During inflammation or infection, different Th subsets are activated, playing a fundamental role in the type of response and the degree of amplification. These subsets are classified by their cytokine profile and the expression of specific transcription factors (master regulators) that direct their functional activity [145]. Thus, Th subpopulations are organized into two branches, effector Th cells and regulatory T cells (Treg). Th1, Th2, Th17, Th follicular (Tfh), Th9, and Th22 subsets are found within the branch of effector Th cells. In an immune steady-state, the balance between these subsets underwrites the preservation of immune tolerance. When a microbial or viral infection or tissue damage occurs, this balance changes from a tolerant state to an immunogenic/inflammatory state, until the immunogen is eliminated. Then the homeostatic regulatory mechanisms are recovered, and the system returns to its initial state. Inflammatory and/or autoimmune diseases occur when these mechanisms fail [146]. Some of the Th subpopulations play a key role in these pathologies; for example, Th1 and Th17 are the key effector Th cells in RA and Crohn's disease, while a loss in the number or function of Treg has been described. Not only is the abundance of each subset important in the development of autoimmune diseases, but so is the plasticity observed between them and even their heterogeneity [147–150]. In this sense, pathogenic Th17 can change its lineage commitment to a Th1 profile, called nonclassical Th1 or ex-Th17. This has been observed in different mouse models of autoimmune diseases and in RA patients [151–154]. Th plasticity is also observed in Treg, which can shift its lineage commitment to Th1 or Th17. In turn, nonpathogenic Th17 can acquire a Treg profile [150]. Therapeutically, it is important to know the involvement of each subpopulation in the different pathologies, as well as their possible plasticity or heterogeneity.

VIP is a microenvironment mediator involved in the generation of diversity and plasticity of Th subsets in inflammatory or autoimmune diseases. This claim is supported by numerous experimental studies in both animal models and ex vivo samples from patients [5,155–158]. VIP was able to decrease the cytokine profile and master regulators related to Th1 and Th17 subsets and to increase those related to Treg or Th2 in different animal models of autoimmunity, such as the collagen-induced arthritis (CIA) mouse model of RA, the TNBS mouse model of Crohn's disease, the nonobese diabetic (NOD) mouse model of autoimmune diabetes, the experimental autoimmune encephalomyelitis (EAE) mouse model of multiple sclerosis, the experimental model of autoimmune myocarditis, and the pristane-induced model of lupus nephritis [7,11,159–165]. In addition, this immunomodulatory role of VIP was observed in two inflammatory animal models, those of CNS inflammation and atherosclerosis [108,164]. The same effect was observed in mouse Th cells activated in vitro and in Th lymphocytes from patients activated ex vivo and treated with exogenous VIP, mainly in studies with RA patients [153,157,166,167]. VIP not only acts on a specific subset in these pathologies, but is also able to balance the different Th subsets, inducing nonpathogenic phenotypes or modifying their plasticity. Studies on different transcription factors, cytokines, cytokine receptors, chemokines, and chemokine receptors in the above-mentioned mouse models, in vitro or ex vivo, showed that VIP counterbalances the ratio of Th1/Th2, Th17/Treg, Th1/Treg, or Th2/Th9, reducing pathogenicity and increasing tolerance [10,165,168]. Th17 cells are a heterogeneous subset with a nonpathogenic or pathogenic profile, depending on the microenvironment. VIP maintains the nonpathogenic profile of human Th17 cells polarized in vitro from naïve Th cells [169]. Indeed, it lowers the pathogenic Th17 profile in activated/expanded memory Th cells ex vivo from early RA patients [153,170]. Taking into consideration the plasticity of Th subsets, this neuropeptide decreases the Th17/1 profile, inducing a negative correlation between Th17 and Th1 in ex vivo cultured cells from early RA patients, but also increases the Th17/Treg profile [153,169,170]. The effect of VIP on the plasticity of Th17 cells is in agreement with its effect on heterogeneity, since the nonpathogenic Th17 phenotype is closely related to Th17/Treg plasticity.

In summary, the generation/differentiation, plasticity, and heterogeneity of Th subsets are crucial events during the development of inflammatory/autoimmune diseases. These processes are susceptible to modulation by different mediators present in the microenvironment of Th cells, an example of which is the VIP neuropeptide that induces a less pathogenic and more tolerogenic response in Th cells.

#### *4.4. Inducing Tolerogenic Dendritic Cells*

Conventional or classical dendritic cells (DCs) are critical for initiating the activation and differentiation of T cells during an inflammatory state, mainly due to their co-stimulatory capacity. They can be classified functionally according to their maturation state in immature or mature DCs [146]. Immature DCs, also called lymphoid organ-resident DCs, are phenotypically immature since they show on their surface low amounts of costimulatory receptors. When they migrate, they initiate a maturation process by strongly expressing these receptors. During the maturation process of DCs, they can differentiate into tolerogenic or immunogenic antigen-presenting cells, each distinguished by specific cytokine production and cell-surface receptors. Immunogenic DCs develop an immunogenic/inflammatory state, whereas mature tolerogenic DCs can induce immune tolerance [146,171]. These latter cells are prompted by either anti-inflammatory signals or signals interfering with the function of immunogenic DCs. Their role is to inhibit effector and autoreactive T cells and trigger Treg development. As a consequence, they play a main role in inducing immune tolerance, resolution of ongoing immune responses, and prevention of autoimmunity [172,173].

An increasing body of data indicates that tolerogenic DCs could be promising therapeutic targets in the treatment of autoimmune diseases [172,174]. One of the approaches is to generate ex vivo tolerogenic DCs for DC-based immunotherapy [175,176]. In this sense, VIP-treated DCs retained their tolerogenic ability in vitro and in vivo under different inflammatory situations [45,177,178]. Two strategies have been followed to generate VIP-tolerogenic DCs: VIP treatment during differentiation of DCs derived from bone marrow or monocytes, or using lenti-VIP transduced DCs [6,179]. In either case, the later administration in vivo of these cells produces Ag-specific Treg capable of inducing specific tolerance to naïve recipients. In this way, they cause the attenuation of symptoms of different animal models of autoimmune and/or inflammatory diseases, for example, in CIA arthritis, TNBS-induced colitis, EAE, sepsis, and spontaneous autoimmune peripheral polyneuropathy [177,179–182]. In addition, in vitro studies with VIP have shown that it affects not only the phenotypic and functional maturation of DCs, but also the migration of these cells [183–185].

#### **5. Protective Effects of VIP in Inflammatory/Autoimmune Diseases**

#### *5.1. Rheumatic Diseases*

#### 5.1.1. VIP in Rheumatoid Arthritis

RA is a systemic inflammatory rheumatic disease of unknown etiology, with a significant autoimmune component, characterized by a persistent synovitis of symmetrical peripheral joints and the presence of auto-antibodies such as rheumatoid factor and anti-citrullinated protein antibodies (ACPA) [186–188]. The natural course of RA is generally associated with progressive destruction of articular cartilage and bone, resulting in a severe functional impairment and serious worsening of the patient's quality of life. However, RA is described as a heterogeneous disease with several subtypes that differ in clinical symptoms, such as age of onset, rate of progression, disease severity, and outcome [187,189,190]. Its complexity is also reflected in the fact that it is considered a multifactorial disease, as genetic background and environmental conditions, including infectious events and dysbiosis in the gut and the lung microbiome, have been indicated as factors involved in triggering the aberrant immune response [186,187].

Although RA pathogenesis is not completely understood, it is widely accepted that local and systemic immune dysregulation, as a result of imbalance in the Th cell subsets, plays an important role in creating a synovial joint microenvironment that favors a hyperactivated phenotype of SF and macrophages. Indeed, both cell types are thought to be central to disease progression by mediating synovial hyperplasia and the release of pro-inflammatory cytokines and tissue damaging enzymes [187,191–194]. In RA, resident synovial cells and both adaptive and innate immune cells establish a positive feed-forward activation loop mediated by pro-inflammatory cytokines such as TNF-α, IL-1β, IL-6, and IL-12, which perpetuate the disease and ultimately lead to joint destruction [195–198].

Experimental evidence accumulated over the last two decades has demonstrated beneficial effects of VIP in all stages of RA development through its anti-inflammatory and immunomodulatory abilities [131,158,199] (Figure 2). In addition, numerous studies have shown the direct antimicrobial activity of VIP against a wide range of bacteria [200], as well as its protective effect in polymicrobial sepsis [201]. Interestingly, VIP is able to counteract the effects of LPS from *Porphyromonas gingivalis* in monocytes, which is an oral bacterium related to increased risk of arthritis associated with periodontal disease [202,203].

**Figure 2.** Biological effects of VIP in rheumatoid arthritis. Schematic representation of an RA joint. Green arrows indicate "induction", whereas red arrows indicate "inhibition".

Initial data about the anti-inflammatory properties and therapeutic potential of VIP in the context of RA were obtained in the CIA mouse model [11,199]. Exogenous administration of VIP was able to reduce the incidence and severity of arthritis in mice, inducing a dramatic decrease in cartilage and bone erosion. In that model, VIP has been demonstrated to modulate the subsets of Th lymphocytes by promoting a Th2-type response while expanding CD4<sup>+</sup> CD25<sup>+</sup> Treg [11,204]. In line with these findings, other studies in the same arthritis mouse model showed protective effects of VIP on bone destruction by modulating the RANK/RANKL/OPG system through the downregulation of the Th17 response and subsequent increase of the Treg/Th17 ratio [84,159,168]. Moreover, an inhibitory action of VIP on osteoclastogenesis has been described in CIA mice, exerting a direct effect on osteoclast progenitor cells purified from bone marrow, as well as through its modulatory action on stromal and osteoblast cells [84,205].

In light of the anti-inflammatory and immunomodulatory effects of VIP in the CIA model detailed above, several studies assessed the role of this neuropeptide in the context of human RA. In accordance with the murine model, in vitro studies on human synovial fibroblasts, macrophages, peripheral blood lymphocytes, and Th cells from patients with RA confirmed the ability of VIP to regulate components of both innate and adaptive immune responses [158].

In brief, VIP has been shown to significantly attenuate the basal and TNF-α-induced production of pro-inflammatory chemokines and IL-6 in both synovial tissue suspensions and SF from RA patients [98]. Interestingly, such anti-inflammatory effects were later fully reproduced in cultured RA-SF by specific VPAC2 agonists, consistent with the dominant presence of that receptor described in these cells [38].

Subsequent studies on RA-SF also proved an inhibitory effect of VIP on the expression and signal transduction of some PRRs, which are linked to the pathogenic activation of these synovial cells [193,197], as previously explained. Moreover, VIP is able to downregulate the enhanced expression of the IL-22 specific receptor, preventing the IL-22 stimulatory effects on proliferation and production of matrix metalloproteinase-1 (MMP-1) and S100A8/A9 alarmins involved in RA-SF mediated joint destruction [206]. Likewise, it has been described that VIP counteracts the stimulatory effect of pro-inflammatory mediators, including TLR3 and TLR4 ligands, TNF-α, and IL-17, on the expression of IL-17 receptors and the IL-12 family of cytokines in RA-SF, which, in turn, mediates their cross-talk with Th1/Th17 cells [207]. Along with the anti-inflammatory effects of VIP in RA-SF through its action on TLR, this neuropeptide has been described to decrease the pro-inflammatory peptides corticotropin releasing factor (CRF) and urocortin (UCN)-1, while increasing the expression of the potential anti-inflammatory agents UCN-3 and CRF receptor 2 (CRFR2). Moreover, VIP is able to inhibit CREB activation, cyclooxygenase 2 expression, and prostaglandin E2 (PGE2) secretion in RA-SF [208].

In line with these findings, the potent anti-inflammatory role for VIP on cellular components of the immune system in the context of RA has also been validated by in vitro studies [158]. Upregulated levels of pro-inflammatory mediators, including TNF-α, IL-6, and CXCL8 and CCL2 chemokines, in polyclonally stimulated peripheral blood lymphocytes from RA patients were reduced after treatment with VIP [167]. Furthermore, regarding its effects on macrophages, VIP was able to impair the acquisition of the pro-inflammatory polarization profile described for macrophages in RA synovium, favoring instead an anti-inflammatory phenotype [32]. Additionally, the involvement of VIP in the modulation of Th subsets has been extensively studied, as previously detailed.

Apart from the effects of VIP treatment in animal models and in cultured cells from RA patients, recent studies have focused on evaluating the potential value of endogenous VIP as a biomarker in RA, as we discuss later.

#### 5.1.2. VIP in Osteoarthritis

OA is a chronic rheumatic disease and is considered the most prevalent in developed countries and the main cause of disability in the elderly population. It is a complex multifactorial disease and is the clinical endpoint of heterogeneous disorders with common clinical, pathological, and radiological characteristics, resulting in the alteration of one or more joints [209–213]. Although it is usually an age-related disease, OA is also associated with multiple other risk factors that culminate in joint dysfunction, including genetic predisposition, epigenetic factors, gender, obesity, exercise, work-related injury, and trauma [209,214–216].

OA is characterized by cell stress and extracellular-matrix (ECM) degradation, resulting in an imbalance in joint-tissue metabolism, which culminates in a progressive loss of synovial joint function, with pain and disability. While cartilage degradation is the main event, the view of OA as solely a pathology of cartilage has changed in recent years. This pathology affects the whole joint, resulting in the remodeling of adjacent subchondral bone, osteophyte formation, and synovial inflammation [213,217–226]. Although OA has an important mechanical component, it is currently also considered as a low-grade inflammatory disease. The biological imbalance and the mechanical stress lead to a pathological situation, with altered chondrocyte behavior, which results in the release of inflammatory mediators and ECM-degrading enzymes [19–22]. All of these factors, along with the inhibition of cartilage biosynthesis, increase the fragility and loss of cartilage integrity [23]. Although synovitis is usually localized and may be asymptomatic in OA [24], synovial activation causes the release of inflammatory mediators and proteases that accelerate the progression of the disease [2,25,26]. Moreover, the subchondral bone is also affected and is involved in the progression of OA through the release of catabolic mediators that promote an altered metabolism in chondrocytes [14,27].

The majority of available therapies for OA focus on relieving symptoms rather than slowing the progression of the disease. Therefore, it is important to find new therapeutic targets for the development of new drugs to treat the disease [227,228].

While the association of VIP with RA has been widely studied, its role in OA is not well established, although, after RA, it is the rheumatic pathology in which most advances have been made in the study of VIP function [229] (Figure 3). Less is known about the role of VIP in other disorders such as systemic lupus erythematosus or spondyloarthritis (SpA). The effects of VIP reported in rheumatic diseases could be mediated in part by its action on the SF, as has been described in several in vitro studies [82,98,142,207,230]. OA-SF expresses and releases VIP, with a greater expression than RA-SF [38]. However, its expression is decreased in the synovial fluid and cartilage of OA patients compared to healthy controls, which could contribute to the pathology [231,232]. Regarding VIP receptors, both VPACs are detected in OA-SF, with a greater expression of VPAC1. Pro-inflammatory mediators released to the joint microenvironment during the disease, such as TNF-α, decrease the expression of VIP and modulate the VPAC1/VPAC2 ratio, thereby bringing its profile closer to that of RA-SF [38].

VIP is able to counteract the action of pro-inflammatory mediators, alleviating the inflammation and the pain in OA. VIP reduces the serum levels of TNF-α and IL-2 and increases serum IL-4 in a rat model of knee OA. In this model, VIP also inhibits proliferation of OA-SF and decreases the production of TNF-α, IL-2, MMP-13, and ADAMTS-5 (a disintegrin and metalloproteinase with thrombospondin motifs-5), at the same time that it induces the expression of type II collagen and osteoprotegerin, by inhibition of NFκB signaling [232,233]. In addition, VIP modulates the corticotropin-releasing factor family of neuropeptides, also increasing the expression of the potential anti-inflammatory mediators UCN-2 and -3, as well as CRFR2 in OA-SF. Moreover, VIP increases cAMP and induces CREB activation in OA-SF [208], which would support its anti-inflammatory role through the inhibition of other signaling pathways, involving JNK-MAPK, NFκB, or c-Jun, inhibiting the production of pro-inflammatory mediators and promoting the expression of anti-inflammatory cytokines [85,234–236].

On the other hand, some studies reported that the accumulation of VIP in joints can also contribute to the pathogenesis of OA. Thus, VIP treatment in rat OA knees promotes synovial hyperemia, as well as sensitization of joint afferent fibers via AC/cAMP/PKA, also increasing firing rate and decreasing mechanical threshold during movement. Therefore, VIP might promote mechanosensitivity and pain in rat OA models [37,225,232,237,238]. Moreover, Rahman et al. reported that VIP stimulates PGE2 production in human articular chondrocytes, human osteoblast-like cells, and human SF, as well as cAMP production in human osteoblast-like cells, suggesting a pro-inflammatory role for this peptide [239]. Another study also related increased VIP levels in the synovial fluid to the presence of synovitis in OA patients [240], suggesting that both downregulation and upregulation of VIP could contribute to the OA pathology [232].

**Figure 3.** VIP effects in OA. Schematic overview of VIP effects in OA. Green arrows indicate "induction" whereas red arrows indicate "inhibition".

In addition to the inflammatory process, ECM degradation and cartilage loss are key factors in the OA pathology. In this regard, VIP might prevent cartilage damage, since VIP modulates the profile of ECM-degrading enzymes released to the joint microenvironment by SF from OA patients. Thus, VIP decreases the expression and activity of the proteinase urokinase-type plasminogen activator (uPA), as well as the production of its receptor (uPAR), after stimulation with the pro-inflammatory cytokine IL-1β or the degradative mediators 45 kDa fibronectin-fragments (Fn-fs). On the other hand, VIP induces the production of the plasminogen activator inhibitor-1 (PAI-1) under basal conditions in these cells. Furthermore, VIP reduces the production of MMP-9 in IL-1β- or Fn-fs-stimulated OA-SF, as well as MMP-13, the main proteinase involved in the degradation of type II collagen, after stimulation with Fn-fs [220]. In addition, VIP decreases the production of ADAMTS in OA-SF, including the aggrecanases ADAMTS-4 and -5, key proteinases in the degradation of aggrecan from the cartilage ECM, after IL-1β or Fn-fs stimulation, as well as the cartilage oligomeric matrix protein (COMP)-degrading ADAMTS, namely ADAMTS-7 after both stimuli and ADAMTS-12 after Fn-fs treatment. In this sense, VIP also reduces COMP degradation from cartilage explants cultured with IL-1β- or Fn-fs-stimulated OA-SF, as well as the aggrecanase activity and glycosaminoglycans (GAGs) release only after Fn-fs stimulation. Moreover, VIP inhibits the activation of Runx2 transcription factor and Wnt/β-catenin signaling involved in ECM remodeling and proteinase expression, after the stimulation of these cells with both stimuli [241].

Few studies have focused on the presence of VIP and its receptors in chondrocytes [37]. As previously described in SF, articular cartilage from OA patients also has lower VIP levels compared to controls. Moreover, VIP expression in synovial fluid is positively correlated to its optical density in articular cartilage [231]. Juhász et al. described the expression of VPAC1, VPAC2, and PAC1 in chicken chondrogenic cells [242,243].

Concerning subchondral bone, VIP receptors have been described on osteoclasts and osteoblasts of several species, including human, mouse, and rat [37,244]. VIP inhibits osteoclast-mediated bone resorption and induces the production of IL-6 from osteoblasts, regulates the expression of osteoclastogenic factors like RANKL and OPG in osteoblasts, and seems to be involved in osteoblastogenesis [37,242,245,246]. In addition, VIP promotes osteoblast activity and proliferation and stimulates bone remodeling [244,247]. Furthermore, Xiao et al. showed higher VIP levels in the femoral bone from OA postmenopausal women compared to those with osteoporosis, where VIP was also positively associated with pain [248].

Recent studies also associate VIP levels with the progression of rheumatic diseases. In this regard, VIP levels in synovial fluid and cartilage of OA patients are negatively associated with progressive joint damage, making VIP a potential indicator of disease severity [231]. In addition, VIP could be postulated as a potential therapeutic target in OA, since it is involved in the activation of several anabolic signaling pathways in the synovial joint [245].

#### *5.2. Inflammatory Bowel Disease*

Inflammatory bowel disease (IBD) is the prevailing gut autoimmune disorder and comprises Crohn's disease (CD) and ulcerative colitis (UC). As with other autoimmune diseases, the origin is multifactorial and comprises genetic, environmental, and host-related factors that affect the development of bowel inflammation [249].

UC lesions are located within the colon, while CD is a relapsing remitting granulomatous disease, which can affect any segment of the digestive tract, producing transmural lesions. Although IBD pathogenesis is unclear, an atypical immune response to intestinal microbial products and/or food allergens represents an important causal factor. Moreover, interactions between the enteric nervous system (ENS) and the immune system play an important role in its pathophysiology [250,251]. These communications include the secretion of neuropeptides, which conduct signals bidirectionally between enteric neurons and immune effectors [252]. VIP and its receptors are expressed in the gastrointestinal tract to perform its anti-inflammatory/immunomodulatory action. The source of endogenous VIP in the gut could be of nervous origin, or from lymphoid cells. Concerning receptors, as we previously described, VPAC receptors are expressed in monocytes, macrophages, and T and B cells, as well as in myeloid cells, such as mast and polymorphonuclear cells.

To date, different models of chemically induced IBD have been characterized, showing several clinical, histological, and immune-response characteristics of UC and CD: the dextran sodium sulfate (DSS), oxazolone, TNBS, and dinitrobenzene sulfonic acid (DNBS) models. Each murine model has benefits, as well as limitations, regarding its clinical, immunological, and histopathological relevance to IBD. Administration of 3–10% DSS in the drinking water is one of the most common chemical methods used to induce colitis in rodents [253]. The oxazolone-induced colitis represents a model of Th2-driven inflammation. In this model, colitis is induced by intracolonic instillation of the haptenating agent oxazolone dissolved in ethanol after a skin pre-sensitization step [254]. However, there is limited information about the time course and cytokine profile of the immune response involved in this model of colitis.

Other models of colitis are induced by the haptens DNBS or TNBS, which are diluted in ethanol and administered by rectal instillation. The haptenization of host proteins induces infiltration of neutrophils, macrophages, and Th1 lymphocytes in the injured mucosa. In comparison to DNBS, TNBS is considered to be a hazardous chemical due to its highly oxidative properties. Since it was developed more recently, research using the DNBS-induced model is less common [255].

Gut inflammation and a differential expression profile of cytokines are key properties of the immune response in these models. The TNBS colitis model develops with an elevated Th1–Th17 response (increased IL-12 and IL-17), while DSS colitis switches from a Th1–Th17-mediated acute inflammation (increased TNF-α, IL-6, and IL-17) to a predominantly Th2-mediated inflammatory response (increase in IL-4 and IL-10 and associated reduction in TNF-α, IL-6, and IL-17) [256]. This dissimilar cytokine profile has been used to establish an equivalence with human IBD. Thus, TNBS colitis mimics CD, while chronic DSS colitis mimics UC [257].

The first report on the role of VIP as a therapeutic agent in IBD was published in 2003, in the TNBS colitis model [7]. VIP treatment reduced the clinical and histopathologic severity of TNBS-induced colitis, abolishing body-weight loss, diarrhea, and macroscopic and microscopic gut inflammation. The VIP effect is mediated by both innate and acquired immune responses. Regarding innate immunity, administration of VIP in the TNBS model decreased myeloperoxidase activity in colon extracts, a specific marker of neutrophils, and reduced the expression of receptors involved in neutrophil recruitment, such as CXCR1 and CXCR2 [7,156,258]. CD4<sup>+</sup> T helper cells are major initiators of IBD. CD4<sup>+</sup> T cells are enriched in the gut of patients with CD and UC and blockade or reduction of CD4<sup>+</sup> T is effective in treating patients with IBD [259].

Th1 and Th17 subsets are important players in the development of CD [260]. In the TNBS model, we reported that VIP reduced IFN-γ and TNF-α while enhancing IL-4 and IL-10 levels in colon and cell cultures from splenocytes and lamina propria immune cells, thus promoting Th2 vs. Th1 responses. VIP diminished IL-17, IL-21, and IL-17R mRNA expression in the colon, supporting an inhibitory action on the Th17 cell subpopulation. Interestingly, VIP increased the Foxp3 and transforming growth factor (TGF)-β mRNA expression in CD4<sup>+</sup> cells from mesenteric lymph nodes, as well as the IL-10 expression in the colon, upregulating Treg responses [7,156,258]. Finally, VIP also reduced the TNBS-induced numbers of CD4<sup>+</sup> T lymphocytes, whereas it induced an increase in the number of B lymphocytes (CD19<sup>+</sup>) in mesenteric lymph nodes.

To date, different mechanisms involved in the therapeutic effect of VIP have been described. One of the first actions described was the VIP modulation of TLR. Among the environmental factors, the modification of gut microbiota or dysbiosis has been reported as a key element in the development of IBD [261].

Additionally, the receptors of the innate immune system, TLRs, affect many aspects of IBD etiology, including immune responses and microbiota. Differential expression of TLRs in IBD patients in comparison with healthy donors has been characterized. Modification of TLR expression or signaling has been reported, not only in experimental models of IBD in mice, but also in human IBD. Most TLR signaling pathways participate in the development of IBD and are sometimes beneficial and other times harmful [262]. Nevertheless, much of the evidence has indicated that the TLR2 and TLR4 signaling pathways have a negative role in IBD. It was reported that the inhibition of TLR2–TLR6/1 activity ameliorated DSS-induced colitis. In healthy individuals, TLR4 is expressed at a low level in intestinal epithelial cells; however, its expression was upregulated in the intestinal epithelia of patients with active UC, suggesting that TLR4 could be involved in UC disease development [262].

In the TNBS-induced colitis model, VIP treatment exerts a time-course inhibition of TLR2 and TLR4 expression in colon epithelial and mononuclear cells. Moreover, VIP acts at a systemic level in lymph nodes. Mesenteric lymph nodes are the draining nodes of the intestinal tract that regulate the traffic of lymphoid cells. VIP inhibits the TNBS-induced TLR2 and TLR4 overexpression in macrophages, dendritic cells, and the lymphocyte subpopulations CD4<sup>+</sup> T, CD8<sup>+</sup> T, and CD19<sup>+</sup> B cells [134,135]. The peptide also enhances the expression of Foxp3 and TGF-β, which are both involved in regulatory T-cell function. Overall, we reported that, after specific stimulation of TLR2 and TLR4, VIP exerts a homeostatic function, balancing innate and adaptive immune responses in the murine model of CD, both locally in the colon and at the periphery in lymph nodes [131,156,263,264].

Another study using the TNBS-induced colitis model reported that treatment with VIP did not modify the clinical and histological parameters [265]. However, a more recent study confirmed our results. Because the nanocarrier sterically stabilized micelles (SSM) protect peptides from enzymatic degradation, improving their bioavailability and half-life, Jayawardena et al. developed sterically stabilized micelles of VIP (VIP-SSM). They characterized the beneficial role of VIP and VIP-SSM in the DSS-induced colitis model. At clinical and histological levels, VIP and VIP-SSM treatment decreased the pro-inflammatory cytokine profile in the colon, reducing tight junction and ion-transporter protein expression associated with severe DSS colitis [266].

It is also important to note that VIP has shown beneficial effects in other models of colitis, such as colitis induced by *Citrobacter rodentium* [267] and the oxazolone-induced colitis [268].

Results are variable in knockout (KO) mice of the VIP/VPAC receptor axis [269]. VPAC2-KO mice showed worse progression of DSS-induced colitis, whereas VPAC1-KO mice were resistant to DSS-induced colitis [270]. Concerning VIP-KO mice, the results using the chemically induced colitis models are contradictory. DNBS- and DSS-induced colitis were more severe in VIP-KO than in wild-type mice. VIP treatment recovered the phenotype, protecting VIP-KO mice against DSS colitis. Moreover, VIP is beneficial for the development and maintenance of the colonic epithelial barrier structure under physiological conditions and for promoting epithelial repair and homeostasis during colitis [271]. In contrast, Abad et al. reported in the TNBS model that mice lacking VIP developed reduced colitis [272]. These discordant results using the KO model of the VIP axis could be explained by the presence of differential microbiota, by alterations in the development of the chemically induced model of colitis, or by the existence of compensatory mechanisms in VIP-KO mice, mediated by PACAP, a related peptide, or by another mediator.

Despite the few discordant results about the effect of VIP in the TNBS model of CD and those obtained with the KO mice, the overall findings are robust and are summarized in Figure 4.

**Figure 4.** Biological effects of VIP in the TNBS-induced murine model of Crohn's disease. Main effects of VIP on disease's development are represented schematically. Green arrows indicate "induction", whereas red arrows indicate "inhibition".

Data on the role of VIP in humans are scarce and relate to the presence of the peptide in healthy conditions and in IBD patients. Contradictory results about alterations in gut VIP innervation in IBD patients have been reported. Several studies described enhanced VIP expression in the intestine in IBD patients [273]. Moreover, an increase in VIP immunoreactivity in both nerve fibers and neurons was characterized in CD patients [264]. Conversely, other studies have characterized a reduction in the abundance of VIP-immunoreactive nerve fibers in the lamina propria and submucosa in both CD and UC patients. Remarkably, the variation in the decrease was significantly related to the severity of the disease [274]. In general, it is well recognized that a broad loss of mucosal neuropeptidic innervation may be related to areas of high inflammation; thus, the contradictory results could be explained by the experimental conditions.

In a recent elegant study, Sun et al. described the beneficial role of VIP in UC patients. In agreement with data reported in several rheumatic diseases [245,275,276], they found that serum VIP levels are lower in UC patients than in healthy controls. The study also provided evidence that serum VIP levels are lower in IgE<sup>+</sup> UC patients than in IgE<sup>−</sup> UC patients [268].

The same authors found that in the regulatory B cells (Bregs) from peripheral blood of UC patients, immune suppressive function is impaired, probably due to lower serum VIP levels and lower IL-10 expression in Bregs. This expression increases with the presence of VIP, which stabilizes IL-10 expression in Bregs. In brief, they demonstrated that VIP administration restored Breg function, inhibited pro-inflammatory cytokine production, prevented the allergen-specific T-cell response, and reestablished colon tissue structure in experimental colitis. All of these results suggest that VIP is a potential therapeutic agent for UC patients with atypical immune responses to food allergens [268].

To date, no curative therapy is available for the treatment of IBD, and combined therapy seems to be the best approach [261]. Despite the contradictory data from KO models, the VIP axis represents a promising candidate for use in combined therapy due to its multistep action on the immune response.

#### *5.3. Central Nervous System Diseases*

The involvement of inflammatory processes and the adaptive immune system in the pathophysiology of neurodegenerative diseases is supported by evidence from a variety of studies [277–280]. Thus, inflammatory responses may be involved in both regenerative and degenerative processes, e.g., in multiple sclerosis and Parkinson's disease (PD). During neuroinflammation, activated glial cells of the brain, mainly microglia and astrocytes, release several factors, many of which are pro-inflammatory, neurotoxic, and damaging to nerve cells.

In vitro and in vivo investigations have described potent neuroprotective features for VIP promoting neural cell proliferation, survival, axon regeneration, and production of neurotrophic factors, as well as inhibition of inflammation [108,281,282]. All of this indicates that the VIP/receptors axis could be a novel therapeutic target in multiple sclerosis and Parkinson's disease.

Multiple sclerosis (MS) is a chronic inflammatory autoimmune and neurodegenerative pathology of the CNS that leads to demyelination. Experimental autoimmune encephalomyelitis (EAE) is the most common animal model for MS, sharing many clinical and pathophysiological features that result in the generation of autoreactive T cells and eventually culminate in myelin destruction. As observed in other inflammatory diseases, treatment with VIP reduced the clinical and pathological scores in EAE, with a blockade of symptoms lasting 60 days. These effects were associated with decreased spinal cord levels of pro-inflammatory cytokines (TNF-α, IL-6, IL-1β, IL-18, and IL-12), iNOS, chemokines (CCL5, CCL3, CXCL1, CCL2, and CXCL10), and CC chemokine receptors (CCR-1, CCR-2, and CCR-5) and increased levels of the anti-inflammatory cytokines IL-10, IL-1Ra, and TGF-β [180].

These investigations revealed that VIP treatment decreases the presence of encephalitogenic Th1 cells in the periphery and the CNS. As a consequence, VIP reduces the appearance of inflammatory infiltrates in the CNS, the loss of oligodendrocytes, and the subsequent demyelination and axonal damage typical of EAE [283].

Several studies point to Tregs as key agents in MS and EAE, controlling self-reactive cells and inducing a decrease in inflammation [284–287]. Administration of VIP to EAE mice induces the expansion of Treg cells that express CD4<sup>+</sup> CD25<sup>+</sup> Foxp3 and produce IL-10/TGF-β in the periphery and the CNS [288].

In accordance with these results, it has been described that mice with a genetic deletion of the VPAC2 gene exhibit an exacerbation of EAE induced by MOG35-55 compared to wild-type mice, presenting an increased pro-inflammatory cytokine profile (TNF-α, IL-6, IFN-γ, and IL-17) and reduced production of anti-inflammatory cytokines (IL-10, TGF-β, and IL-4) in the CNS and lymph nodes. In addition, the proliferative index and the in vivo suppressor activity of CD4<sup>+</sup> CD25<sup>+</sup> FoxP3<sup>+</sup> Tregs are markedly reduced in VPAC2-KO mice with EAE. These results point toward an important protective role for the VPAC2 receptor against autoimmunity and as an anti-inflammatory mediator [289].

Unexpectedly, and in contrast, Abad et al. found that VIP-KO mice are highly resistant to EAE. This finding was confirmed by histopathology and clinical evaluation. Supporting this phenotype, the levels of multiple pro-inflammatory cytokines in the spinal cord were strikingly reduced in the KO mice. The authors found that immune cells were trapped in the meninges of the brain and spinal cord and failed to invade the CNS parenchyma, suggesting a defect in immune-cell migration [290]. Clinical disease in these mice was blocked at a step downstream from immunization. Similarly, EAE clinical symptoms are significantly ameliorated in VPAC1-KO mice. The results demonstrate stronger Th1 and Th17 responses, which are known to induce the pathogenesis of EAE, but reduced Th2 responses in these mice. As the phenotype of VPAC1-KO mice is opposite to that of VPAC2-KO mice, it has been suggested that, in addition to Th polarization, other events are differentially mediated by VPAC1 and VPAC2, and this may depend on different factors, such as their level of expression, the state of cellular activation, the interaction with other mediators present in the microenvironment, such as inflammatory factors, and finally, the disease phase [291].

Studies in patients with MS have reported alterations in components of the VIP/receptors signaling system. Andersen et al. found a reduced VIP immunoreactivity in the cerebrospinal fluid of patients diagnosed with MS [292]. Similarly, Baranowska-Bik et al. observed a tendency toward reduced levels of VIP in the cerebrospinal fluid of MS patients, although this difference was not statistically significant [293]. Interestingly, CD4<sup>+</sup> cells derived from the peripheral blood of patients with MS show a differential expression of the VPAC1 and VPAC2 receptors, compared to those from healthy controls, depending on the activation status of the cells. Without stimulation, similar patterns of VIP receptor expression are detected in CD4<sup>+</sup> cells of subjects with MS and healthy controls, with a visible expression of VPAC1 and minimal levels of VPAC2. However, the markedly decreased expression of VPAC1 observed after stimulation and activation of CD4<sup>+</sup> T cells is compensated for by a higher expression of VPAC2 in healthy individuals. In contrast, activated CD4<sup>+</sup> T cells from MS patients exhibit an altered expression of VPAC2 as a result of altered gene regulation in the promoter region of the VIP receptor. As a consequence, these CD4<sup>+</sup> T cells were less sensitive to VIP and biased the system predominantly in a Th1 direction [294].

Finally, the role of the VIP/VPAC axis has also been investigated in Parkinson's disease. In this progressive degenerative movement disorder, the key roles played by the subsets of CD4<sup>+</sup> cells, especially the Treg whose number or activity is reduced in this pathology, have recently been highlighted. This, together with microglial activation, leads to changes in the microenvironment of the affected brain with oxidative stress, inflammation, and defective protein folding [295–297].

Using murine models of Parkinson's disease, administration of VPAC2 agonists has been shown to increase Treg activity without altering cell numbers, reduce microglial inflammatory responses, increase survival of dopaminergic neurons, and improve striatal densities [63,296].

#### *5.4. Other Autoimmune Disorders*

Type 1 diabetes and Sjögren's syndrome (SS) represent other autoimmune diseases in which the beneficial effects of VIP have been shown. Type 1 diabetes is a T-cell-mediated autoimmune disease associated with the overexpression of inflammatory mediators and the alteration of different subsets of T cells that attack the insulin-producing cells in the pancreas.

Several animal models that develop spontaneous type 1 diabetes have been described, such as NOD mice that exhibit T-cell-mediated insulitis linked to the genes of the major histocompatibility complex. In the NOD mouse model, VIP prevents the increase in the proportion of Th1 to Th17 cells, changes the Tregs/Th17 ratio that leads to tolerance, and reverses the proportion of subsets of Th1/Th2 cells associated with autoimmune pathology. These effects add to the decrease in pro-inflammatory mediators, resulting in a reduction in the destruction of β cells in the pancreas [10].

Studies in KO mice have confirmed the importance of the VIP/VPAC axis in the functionality of the endocrine pancreas. Thus, Martin et al. found that VIP-KO mice exhibit elevated plasma glucose, insulin, and leptin levels [298]. In VPAC2-KO mice, glucose-induced insulin secretion is decreased, with no change in glucose tolerance, whereas mice deficient in VPAC1 show small dysmorphic islets of Langerhans and exhibit impaired neonatal growth that leads to intestinal obstruction and hypoglycemia [299]. Moreover, selective overexpression of the human VIP gene in pancreatic β-cells increases glucose-induced insulin secretion and ameliorates glucose intolerance of 70% depancreatized mice [300].

SS is an autoimmune disease characterized by the infiltration of T lymphocytes at the level of the salivary and lacrimal glands that causes their destruction and the appearance of symptoms related to dry mucous membranes. Recently, the effects of VIP on the immune response and secretory function of the submandibular glands have been investigated by using the NOD model of SS, which develops secretory dysfunction and early loss of glandular homeostatic mechanisms, with mild infiltration in the glands.

Li et al. showed that VIP treatment reduced immune lesions in the exocrine glands and improved their secretory function, both by downregulating the expression of IL-17A in these glands and by increasing the expression of AQP5, a protein that participates in the transport of water through the glandular epithelium [301].

In the course of salivary function impairment in the NOD mouse model, a progressive decrease in VIP expression in the submandibular glands is observed compared to normal mice. The loss of endogenous VIP is associated with a loss of acinar cells through apoptotic mechanisms that could be further induced by TNF-α and reversed by VIP through a PKA-mediated pathway. The clearance of apoptotic acinar cells is impaired in NOD macrophages, which contributes to the loss of gland homeostasis [302].

Lodde et al. constructed a recombinant serotype 2 adeno-associated virus vector encoding the human VIP transgene (rAAV2hVIP) to explore its usefulness in SS management. Instillation of rAAV2hVIP in the submandibular glands (SMG) of NOD mice leads to higher salivary flow rates and increased expression of VIP in the glands and serum, as well as to a reduction of cytokines IL-2, IL-10, and IL-12 (p70) and TNF-α in SMG extracts, and of serum CCL5, compared to the control vector. This work indicates that VIP may be a promising agent for the treatment of the salivary component of SS [9].

Finally, data from humans reveal that monocytes from SS patients show increased expression of VPAC2, which is absent in the monocytes of normal subjects, without changes in the expression of VPAC1. This altered expression correlates with an impaired phagocytosis of apoptotic epithelial cells, with reduced engulfment capacity and failure to express an immunosuppressant cytokine profile that is not restored by VIP. This differential expression of VPAC2 associated with phagocytic dysfunction suggests its potential as a functional biomarker in SS [303].

#### **6. VIP as a Therapeutic Agent: Limitations and Perspectives**

The treatment of inflammatory and autoimmune diseases is a challenge. Advances in knowledge of the underlying pathophysiological mechanisms, as well as the discovery of biological therapies against potent mediators of inflammation, have revolutionized the way these diseases are treated. Newly developed molecules aim to diminish the impact of these diseases on the quality of life of patients, although, to date, there is no curative treatment.

Since the discovery of VIP half a century ago, growing knowledge about its biology, its signaling mechanisms, its powerful anti-inflammatory effects, and its immunoregulatory capacity has made it a potential therapeutic agent for diverse diseases. Among these diseases are asthma [304], pulmonary hypertension [305], sarcoidosis [306], neurological diseases such as Alzheimer's and Parkinson's [307,308], inflammatory bowel diseases such as Crohn's [7,264], autoimmune diabetes [10], and cancer [309].

Marketed under the name Aviptadil, VIP has been used in the clinic successfully in the treatment of pulmonary hypertension and sarcoidosis. However, the therapeutic use of this peptide in clinical practice is still far from its theoretical potential. This is due to its high sensitivity to degradation by proteases, spontaneous hydrolysis, and catalytic antibodies [310,311]. A second limitation for the use of VIP in humans is due to cross-interactions, given its ability to bind to different GPCRs, its functional pleiotropism, and its ubiquity. In addition, systemic administration of VIP with binding to multiple cell targets with high affinity could cause unwanted adverse effects [312,313].

Therefore, strategies have been developed to overcome these difficulties. Delivery systems that are directed against specific targets and also protect the peptide against degradation are desirable options. Recent advances in this field include the following: the use of metal nanoparticles, which seem to increase the therapeutic potential of VIP in terms of both targeting and distribution [314]; the use of modified liposomes with lipopeptides conjugated with VIP, which have demonstrated a selective recognition of VPACs and a more effective antitumor activity in a recent study with human osteosarcoma lines [315]; and nanomicelles, tested in breast cancer [316].

Interestingly, a single subcutaneous injection of a low dose of camptothecin sterically stabilized micelles conjugated with VIP (CPT-SSM-VIP) administered to mice with collagen-induced arthritis was able to abrogate joint inflammation, with no apparent systemic toxicity and with similar efficacy and safety compared to methotrexate, used clinically for RA treatment [317]. The efficacy of this therapeutic approach has been confirmed in the murine model of colitis induced by DSS. Similarly, the anti-inflammatory and antidiarrheal effects of VIP can be achieved effectively when administered as a nanomedicine.

In relation to the problem posed by VIP's binding to different receptors, stable agonists of the VPAC1 and VPAC2 receptors, LBT-3627 and LBT-3393, have been developed recently, based on peptidase-resistant foldamer technology (Longevity Biotech). LBT-3627 is a VPAC2-selective neuroprotective agent that has been successfully investigated in the preclinical phase in a Parkinson's model (Olson et al., 2015) [63]. The results obtained highlight the therapeutic immunomodulatory potential of this agonist to restore Treg activity, attenuate neuroinflammation, and intercept dopaminergic neurodegeneration in PD, as we mentioned above [296].

Finally, gene therapy with VIP, using lentiviral vectors, has yielded good results in the CIA model [318], and VIP adenoviral vectors have also been developed [9]. However, these approaches continue to lack cellular and tissue specificity. Thus, another possibility under study is cell therapy with dendritic cells transduced with a VIP lentiviral vector (LentiVIP-CDs), whose therapeutic effects in sepsis and EAE models have been very positive with a single local administration [179].

Looking toward the future, despite advances in therapeutic options, there is still a need to continue researching the design and transfer to the clinic of stable VIP analogues and specific VPAC1- and VPAC2-receptor drugs, directed against specific objectives, as well as biomarker field approaches to intervene earlier in the course of the disease.

#### **7. Potential of VIP Axis as a Biomarker for Personalized Treatment in Rheumatic Diseases**

In addition to its potential use as a therapeutic agent, the VIP axis could be used in a second potential translational strategy as a prognostic biomarker. Different studies have described an altered expression in the VIP/VPAC axis in autoimmune diseases and in the modulation of the inflammatory immune response in rheumatic diseases [82,98,153,167]. In juvenile idiopathic arthritis, serum VIP levels are decreased in patients who manifest more disease activity characterized by cardiac autonomic neuropathy associated with parasympathetic dysfunction [319]. In RA, the expression of VPAC1 is decreased in peripheral blood mononuclear cells (PBMCs) [318], and a lower expression of VPAC1 mRNA is also observed in patients with early arthritis (EA) [320]. However, the expression of VPAC2 is increased in SF, and PBMCs show an increased expression of VPAC2 mRNA [38,320]. A deregulated expression of VPAC2 has also been described in monocytes isolated from patients with SS and in activated CD4<sup>+</sup> T cells from patients with multiple sclerosis [294].

The abnormal expression of the VIP axis in autoimmune diseases directed the investigations toward the study of its association with the clinical course of some of these diseases and its possible prognostic value. In RA, early diagnosis and the establishment of immediate and effective therapy are essential to prevent greater disease severity [321,322]. Although different parameters have been proposed as prognostic markers for RA (such as rheumatoid factor, ACPAs, erythrocyte sedimentation rate, and C-reactive protein), they are only capable of classifying 65% of patients [323–326]. In this sense, patients with RA with high or moderate activity after two years of follow-up had lower levels of VIP at baseline [275]. In multivariate analyses, ACPA-negative patients with low initial serum VIP levels had an odds ratio (a statistic that quantifies the strength of the association between two events) of 6.1 for high activity at two years of follow-up. This allows the classification of a group of patients with a greater need for treatment within the ACPA-seronegative RA patients. Another factor that makes VIP a potential prognostic marker worthy of further study is the fact that several single nucleotide polymorphisms (SNPs) in the VIP gene are associated with differences in serum VIP levels in patients with EA. The combination of three SNPs (minor alleles in rs688136 and absence of minor alleles in rs35643203 and rs12201140) in the VIP gene allows the identification of patients with less-severe disease, who are thus possibly good candidates for less-intensive therapy [327].
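To make the odds ratio quoted above more concrete, the following minimal Python sketch computes an unadjusted odds ratio and an approximate confidence interval from a 2×2 table. The counts are hypothetical and chosen only for illustration; they are not data from the cited study, whose value of 6.1 comes from a multivariate (adjusted) analysis that this crude calculation does not reproduce. The function names `odds_ratio` and `woolf_ci` are likewise illustrative.

```python
import math


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Unadjusted odds ratio of a 2x2 table.

    a: marker-positive patients with the outcome
    b: marker-positive patients without the outcome
    c: marker-negative patients with the outcome
    d: marker-negative patients without the outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    return (a * d) / (b * c)


def woolf_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Approximate confidence interval on the log scale (Woolf's method)."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts for illustration only (NOT from the cited study).
# "Marker-positive" here means low baseline serum VIP.
low_vip = {"high_activity": 12, "low_activity": 8}
normal_vip = {"high_activity": 6, "low_activity": 24}

or_value = odds_ratio(
    low_vip["high_activity"], low_vip["low_activity"],
    normal_vip["high_activity"], normal_vip["low_activity"],
)
ci_low, ci_high = woolf_ci(12, 8, 6, 24)
print(f"OR = {or_value:.1f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An odds ratio above 1 whose confidence interval excludes 1 indicates that the outcome is more likely in the marker-positive group; with small cohorts the interval is typically wide, which is one reason such markers need validation in larger studies.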

Regarding the VIP receptors, it has been observed that the expression of VPAC1 and VPAC2 could reflect the clinical status in patients with EA, with a significantly lower expression of VPAC1 when patients have systemic inflammatory activation characterized by high serum levels of IL-6 and higher Disease Activity Score 28 (DAS28) values. DAS28 is an index of disease activity developed and validated by the EULAR (European League Against Rheumatism) [320]. In addition, VPAC2 expression prevailed over VPAC1 in cells polarized toward Th17 of EA patients [170]. VPAC2 can also mediate anti-inflammatory effects when the expression of VPAC1 is low [38].

Serum VIP levels also showed a prognostic value in spondyloarthritis, a family of rheumatic diseases that share clinical and radiological manifestations where the most prevalent group is ankylosing spondylitis. These patients are HLA-B27 positive, and their inflammation usually occurs with enthesitis and bone formation that can lead to ankylosis. Patients with SpA presented a wide heterogeneity in terms of clinical manifestations, and there are no good biomarkers that predict progression. In early SpA, patients with lower VIP levels showed more disability and factors related to increased inflammation (bone edema on MRI scan, anemia, enthesitis, and cutaneous psoriasis) [276]. Finally, it has been described that VIP levels may have a protective role in the progression of OA [231].

In summary, VIP is an excellent aspirant to be used in clinical practice as a prognostic biomarker that would complement existing markers, such as ACPAs. Concerning receptors, they emerge as good candidates for activity biomarkers, and current studies would expand their potential as a severity biomarker.

Figure 5 summarizes the current advances in the role of the VIP/receptors axis as biomarkers in rheumatic diseases.

**Figure 5.** VIP and its receptors as biomarkers in rheumatic diseases. The scheme shows the current advances on the role of the VIP axis and VPAC receptors as biomarkers in spondyloarthritis (SpA), osteoarthritis (OA), and rheumatoid arthritis (RA). ↑Higher. ↓Lower. Purple arrows: association of VIP levels; green arrow: association of VPAC1 expression; red arrow: VPAC2 expression; blue arrow: clinical utility of SNPs (single nucleotide polymorphisms) in VIP gene.

#### **8. Conclusions**

On the 50th anniversary of VIP's discovery, this review updates our knowledge about the regulatory functions of the VIP/receptors axis in the immune system and presents a spectrum of potential clinical benefits applied to inflammatory and autoimmune diseases. This article gathers the findings and advances achieved in this field, thanks to the work of numerous researchers, from both basic and translational research areas.

Recent progress in improving the stability, selectivity, and effectiveness of VIP/receptors analogues and new routes of administration are highlighted, as well as important advances in their use as biomarkers, contributing to their potential application in precision medicine.

Despite the achievements, it is necessary to continue researching the design of analogue drugs that are stable, safe, and directed against specific objectives and in the validation of the VIP/receptors axis as biomarkers such that their application in clinical practice becomes a reality for our Very Important Patients.

**Author Contributions:** C.M., Y.J., I.G.-C., M.C., S.P.-G., A.L., M.M., I.G.-Á., and R.P.G. wrote the manuscript; D.C. and R.V.-R. performed the figures; C.M. coordinated the manuscript; S.P.-G., R.V.-R., and D.C. coordinated the bibliography and the format of the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the Fondo de Investigación Sanitaria, Instituto de Salud Carlos III (Grants N◦: PI14/ 00477, PI17/00027, RD16/0012/0008, RD16/0012/0011) and by the Ministerio de Economía y Competitividad (RTC-2015-3562-1), co-financed by Fondo Europeo de Desarrollo Regional (FEDER).

**Acknowledgments:** We are grateful to all patients and the collaborating clinicians for their participation in this study. We are also grateful to Sarah Young for her contribution to the editing of the English manuscript.

**Conflicts of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

#### **Abbreviations**


#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Prolactin-Releasing Peptide: Physiological and Pharmacological Properties**

#### **Veronika Pražienková 1, Andrea Popelová 1, Jaroslav Kuneš 1,2 and Lenka Maletínská 1,\***


Received: 2 October 2019; Accepted: 23 October 2019; Published: 24 October 2019

**Abstract:** Prolactin-releasing peptide (PrRP) belongs to the large RF-amide neuropeptide family with a conserved Arg-Phe-amide motif at the C-terminus. PrRP plays a main role in the regulation of food intake and energy expenditure. This review focuses not only on the physiological functions of PrRP, but also on its pharmacological properties and the actions of its G-protein coupled receptor, GPR10. Special attention is paid to structure-activity relationship studies on PrRP and its analogs as well as to their effect on different physiological functions, mainly their anorexigenic and neuroprotective features and the regulation of the cardiovascular system, pain, and stress. Additionally, the therapeutic potential of this peptide and its analogs is explored.

**Keywords:** prolactin-releasing peptide; GPR10; RF-amide peptides; food intake regulation; energy expenditure; neuroprotection; signaling

#### **1. Introduction**

There is no doubt that the function of prolactin-releasing peptide (PrRP) in organisms is quite important as its structure is well conserved within different animal species. PrRP is reported to regulate food intake and energy metabolism, but it could have several other specific functions, such as the regulation of cardiac output, stress response, reproduction, the release of endocrine factors, and recently neuroprotective features. The site of the main action of PrRP is the brain, where its release is regulated by a number of stimuli, including those coming from the periphery.

PrRP binds with high affinity to the GPR10 receptor and also has lesser activity towards the neuropeptide FF (NPFF) receptor type 2 (NPFF-R2). In addition, cooperation with other food intake regulating neuropeptides, especially leptin, cholecystokinin (CCK), or neuropeptide Y (NPY), is very important for the effects of PrRP.

In structure-activity relationship (SAR) studies, novel PrRP analogs with attached fatty acids and changes in the amino acid chain were synthesized to overcome the blood-brain barrier and to improve stability and bioavailability from the periphery, thus representing interesting candidates for therapeutic use.

#### **2. Discovery and Structure of PrRP**

PrRP was first isolated in 1998 by Hinuma and colleagues from an extract of bovine hypothalamus and was described as a ligand for the orphan seven-transmembrane-domain receptor (7TM) GPR10 (also known as hGR3 or rat ortholog UHR-1) using reverse pharmacology ([1,2] and reviewed in [3]). The cloned full-length cDNA of the *PrRP* gene is 435 bp in length and encodes an 87 amino acid long precursor [4]. The *PrRP* rat gene contains three exons and two introns and spans a region of approximately 2.4 kb [5].

The average precursor length is 105 amino acids with two cleavage sites [6]. From the protein precursors, at least two isoforms of different lengths, PrRP20 and PrRP31 (Table 1), are produced. Shorter PrRP20 shares identical C-termini with the longer form of PrRP31. The fish ortholog of PrRP20, C-RFa, was isolated and described by Fujimoto et al. from the brain of *Carassius auratus langsdorfii* in the same year that PrRP was discovered [7]. The cloned cDNA of the *C-RFa* gene is 997 bp in length and encodes a precursor of 108 amino acids [4]. Subsequently, PrRP was identified in amphibians in *Xenopus laevis* in both isoforms [8]. In birds, specifically in *Gallus gallus*, PrRP has a similar sequence as in fishes and amphibians and is also expressed in the brain [9]. Moreover, Wang et al. measured the expression of both PrRP and C-RFa in chickens, as well as in *Xenopus* and zebrafish, suggesting that those peptides are encoded by two separate genes and may play similar yet distinctive roles in nonmammalian vertebrate species [4].

PrRPs in vertebrates share highly conserved homology and there is evidence that PrRP evolved from a common ancestral precursor in nonmammalian and mammalian species [10]. The precursor is composed of a hydrophobic N-terminal sequence, paired basic amino acids for the recognition of endopeptidases, and a highly conserved C-terminal sequence, where the amino acid glycine is a donor for the amide group. The bovine/human C-terminal octapeptide is Gly-Ile-Arg-Pro-Val-Gly-Arg-Phe-NH2; in fish C-RFa, isoleucine and valine are swapped (Table 1) [6].
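The conservation of the C-terminal octapeptide described above can be checked with a short Python sketch. The mammalian octapeptide is taken from the text; the fish C-RFa octapeptide is derived here by applying the Ile/Val swap that the text describes, so it is an assumption of this sketch rather than a sequence copied from Table 1. Positions are counted within the octapeptide (1 to 8), and the helper names are illustrative.

```python
# Three-letter to one-letter codes for the residues that occur in the octapeptide.
THREE_TO_ONE = {"Gly": "G", "Ile": "I", "Arg": "R", "Pro": "P", "Val": "V", "Phe": "F"}

# Bovine/human C-terminal octapeptide as given in the text (C-terminal amide omitted).
mammalian = "Gly-Ile-Arg-Pro-Val-Gly-Arg-Phe".split("-")

# Fish C-RFa octapeptide, derived by swapping Ile and Val as the text states (assumption).
swap = {"Ile": "Val", "Val": "Ile"}
fish = [swap.get(residue, residue) for residue in mammalian]


def to_one_letter(residues):
    """Convert a list of three-letter residue codes to a one-letter string."""
    return "".join(THREE_TO_ONE[r] for r in residues)


def differences(seq_a, seq_b):
    """List (position, residue_a, residue_b) for every mismatching position (1-based)."""
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two equal-length sequences."""
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


print(to_one_letter(mammalian), to_one_letter(fish))
print(differences(mammalian, fish))
print(f"{percent_identity(mammalian, fish):.0f}% identical")
```

Both peptides keep the terminal Arg-Phe motif that defines the RF-amide family; only the two hydrophobic positions differ.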

The name of PrRP was suggested on the basis of its prolactin-releasing activity in a rat pituitary adenoma-derived cell line and in pituitary cells obtained from lactating rats [1]. Additionally, another study reported that the injection of PrRP stimulated plasma prolactin levels in female rats in proestrus, estrus, and metestrus, and increased doses of PrRP were necessary to increase plasma prolactin in male rats [11]. Nevertheless, this prolactin-releasing function was later questioned because it did not have typical features for hypophysiotropic hormones [12,13]. Currently, PrRP is considered likely to be an anorexigenic (i.e., food-intake-lowering) neuropeptide, which mainly plays a role in the regulation of food intake and energy expenditure [12,14–16], but also regulates stress [17,18], sleep [19,20], and the cardiovascular system [21–23]. In addition, its potential neuroprotective properties have been described [24–26].


**Table 1.** Sequences of prolactin-releasing peptide (PrRP): PrRP20 and PrRP31 in different animal species [1,4,7].

Grey color marks identical amino acids.

#### **3. GPR10 Discovery and Gene Location**

Using polymerase chain reaction (PCR), Marchese et al. discovered genes encoding novel G-protein coupled receptors (GPCRs), including the human gene for *GPR10* [27]. GPR10 shares high amino acid identity with NPY receptor 1 (NPY-1R) and orphan receptor induced by glucocorticoids (GIR) [27]. The amino acid identity with NPY-1R is 31% overall and 46% in the transmembrane domains; with GIR, it is 30% overall and 46% in the transmembrane domains. This GPR10 receptor was later confirmed to be identical to orphan hGR3 reported as a receptor for PrRP by Hinuma et al. [1]. Human GPR10 shares high homology (89%) with rat ortholog UHR-1 [2]. The human gene for *GPR10* is 1107 bp long, encodes a 370 amino acid long protein, and is located on chromosome 10 q25.3–q26.1, with a related sequence on chromosome 13 q14.3–q21.1 [27].

In nonmammalian vertebrates, fish and chicken PrRP receptor genes are located on chromosome 17 and chromosome 5, respectively [27,28]. GPR10 is well conserved in mammals with more than 90% identity; however, in chickens, it is only 54% identical to the mammalian counterpart, probably because of phylogenetic differences. The most conserved sequence is at the C-terminus of the receptor, particularly the last six amino acids, which could interact with a ligand [29,30]. Both isoforms PrRP20 and PrRP31 bind with high affinity to the GPR10 receptor and rat UHR-1 [31].

Later, it was discovered that PrRP has an affinity for NPFF-R2 [32]. Different studies confirmed the molecular and functional identity of the HLWAR77 receptor, which is a common target for NPFF and neuropeptide AF (NPAF), with NPFF-R2 [33]. Human NPFF-R2 shares 89% amino acid identity with its rat ortholog, high homology with NPY receptors [34], and 37% homology with the orexin-A receptor [33].

#### **4. Distribution of PrRP and its Receptor GPR10**

#### *4.1. Distribution of PrRP*

The highest expression of *PrRP* mRNA was measured in the brainstem in the nucleus of the solitary tract (NTS), and a moderate level was detected in the dorsomedial hypothalamic nucleus (DMN), ventrolateral reticular nucleus of the thalamus (VRT) (Figure 1), and in the periphery in the intestine both in rats and humans when analyzed with reverse transcription-PCR [31,35,36]. Immunoreactive cell bodies were found mainly in the DMN, ventromedial hypothalamic nucleus (VMN), NTS, and ventrolateral medulla oblongata (ME), and nerve projections were present in the paraventricular hypothalamic nucleus (PVN), supraoptic nucleus (SON), DMN, lateral hypothalamic area (LHA), thalamic nucleus, amygdala, and area postrema (AP) (Figure 1) [31]. Immunoreactive fibers were also detected in high concentrations in the posterior pituitary [37,38]. Using enzyme immunoassay for PrRP distribution, immunoreactive PrRP was widely present in the hypothalamus, midbrain and posterior pituitary, and ME [37]. In mammals (rats and humans), *PrRP* mRNA in peripheral tissues was found mainly in the adrenal gland, lung, pancreas, liver, kidney, reproductive organs, and gut [35,37,39,40]. The concentration of PrRP in rat plasma was very low (0.13 fmol/mL) [37]. In chicken tissue, *C-RFa* mRNA was detected in the kidney, lung, reproductive organs, heart, intestine, liver, and pituitary [4]. In the amphibious fish, mudskipper, *PrRP* mRNA expression was observed in the brain, liver, gut, and ovary, with lower levels detected in the skin and kidney [41].

**Figure 1.** Distribution of PrRP and GPR10. Ellipses represent distinct brain areas (blue—nucleus accumbens, grey—corpus callosum, green—hippocampus, red—thalamus, orange—hypothalamus, yellow—pituitary, violet—parabrachial nucleus, light green—medulla oblongata). Stars mark the expression of mRNA (red star—*PrRP*, black star—*GPR10*). Spots represent the distribution of PrRP (red), GPR10 (black) cell bodies and fibers. AP: area postrema, C: cerebral cortex, CC: corpus callosum, CE: cerebellum, DMN: dorsomedial hypothalamic nucleus, HB: hindbrain, HIPP: hippocampus, HYP: hypothalamus, LHA: lateral hypothalamic area, ME: medulla oblongata, MIB: midbrain, NAc: nucleus accumbens, NTS: nucleus of the solitary tract, OF: olfactory bulb, P: pituitary, PB: parabrachial nucleus, PEVN: periventricular hypothalamic nucleus, PVN: paraventricular hypothalamic nucleus, RT: reticular nucleus of the thalamus, SON: supraoptic nucleus, SLM: stratum lacunosum-moleculare, TH: thalamus, VMN: ventromedial hypothalamic nucleus, VRT: ventrolateral reticular nucleus of the thalamus.

#### *4.2. Distribution of GPR10*

The highest expression of *GPR10* mRNA was detected in several parts of the rat brain, mainly in the reticular nucleus of the thalamus (RT), PVN, periventricular hypothalamic nucleus (PEVN) and DMN, AP, and NTS. A moderate level of expression of the receptor was also detected in the anterior pituitary and VMN (Figure 1) [31,42]. Radiolabeled <sup>125</sup>I-PrRP31 bound in a specific pattern to the reticular thalamic nucleus and PEVN [31]. GPR10 was also found in the parabrachial nucleus (PB) and nucleus accumbens (NAc), which are areas involved in pain processing [31], and at low levels in the hippocampus (stratum lacunosum-moleculare; SLM), which is involved in memory [2,26]. In the periphery, *GPR10* mRNA was found in the rat adrenal medulla [35,43,44]. Through the detection of mRNA and in situ hybridization or immunohistochemical studies, PrRP and its receptor were found in discrete areas within the brain and periphery. Indeed, PrRP nerve fibers are in close proximity to areas where GPR10 is present, but PrRP still has to be transported to other sites to be released. This fact may also support the hypothesis that PrRP binding and signaling are not restricted to the GPR10 receptor.

#### **5. PrRP Intracellular Signaling Pathways**

Several studies have explored the signal transduction pathways and the potential agonist or antagonist properties of PrRP action at GPR10. Hinuma et al. first reported that PrRP promoted arachidonic acid metabolite release in Chinese hamster ovary (CHO) cells expressing GPR10 [1]. PrRP was able to dose-dependently stimulate calcium release in cells that were transfected with GPR10 in a calcium mobilization assay (Figure 2) [31].

**Figure 2.** PrRP physiological functions and signaling: summary. PrRP and its agonists exert their effects through GPR10. Blue arrow represents activation of the signaling pathway; T-bar represents blocking of the signaling pathway. PrRP stimulated calcium release (Ca<sup>2+</sup>) in a calcium mobilization assay and rapidly activated extracellular signal-regulated protein kinase (ERK, blue). It also activated c-Jun N-terminal protein kinase (JNK, light blue) and phosphorylated cAMP response element-binding protein (CREB, grey). Pertussis toxin (PTX, black) blocked the ERK and Akt activation induced by PrRP. PrRP activated the phosphoinositide 3-kinase-protein kinase B/Akt-mammalian target of rapamycin (PI3K-Akt-mTOR) pathway in leiomyoma cells (PI3K, pink; PKB/Akt, green; mTOR, yellow). PrRP significantly stimulated both the PKA (dark green) and PKC (orange) pathways.

PrRP rapidly activated extracellular signal-regulated protein kinase (ERK) from the mitogen-activated protein kinase (MAPK) family in GH3 rat pituitary tumor cells and in primary rat anterior pituitary cultures (Figure 2) [45]. Moreover, pertussis toxin (PTX), which inactivates Gi/Go proteins, completely blocked the ERK activation induced by PrRP, suggesting that at least part of the coupling of GPR10 is through Gi/Go proteins [45]. Kimura et al. also demonstrated that PrRP activated c-Jun N-terminal protein kinase (JNK) in a protein kinase C (PKC)-dependent manner in GH3 rat pituitary tumor cells [45].

PrRP20 was then reported not to alter basal levels of intracellular cyclic AMP in human embryonic kidney HEK293 cells that were transfected with GPR10, suggesting that in this system, GPR10 does not couple through Gs protein, which would activate adenylyl cyclase to increase the cyclic AMP concentration [46]. In addition, PrRP20 did not decrease forskolin-stimulated cyclic AMP levels, indicating that GPR10 does not couple via Gi, which would inhibit adenylyl cyclase and decrease cyclic AMP levels [46]. Therefore, the possible involvement of GPR10 signaling through the Gq pathway was proposed.

Engstrom et al. tested the ability of PrRP20 or PrRP31 to stimulate [<sup>35</sup>S]GTPγS binding to membranes of CHO cells expressing GPR10; more than 80% of the binding of PrRP was prevented by PTX [32]. Taken together, these data suggest that a large part of the GPR10 coupling occurs via Gi/Go proteins; however, this depends on the cellular system in which the receptor is expressed [32,45,46]. In the study from Engstrom et al., intracellular calcium assays also confirmed the full agonist properties of both PrRP20 and PrRP31 at GPR10 [46].

PrRP rapidly and transiently stimulated the activation of protein kinase B (Akt) in GH3 cells, and a phosphoinositide 3-kinase (PI3K) inhibitor blocked the PrRP-induced activation of Akt (Figure 2) (reviewed in [47]). Additionally, PTX completely blocked the Akt activation induced by PrRP, suggesting the involvement of Gi/Go proteins [48]. PrRP31 significantly induced an increase in the activity of ERKs and JNK, but not p38 MAPK, in the rat PC12 pheochromocytoma cell line [49]. Moreover, PrRP stimulated dopamine release and catecholamine secretion and increased tyrosine hydroxylase levels via the protein kinase A (PKA) and PKC pathways in PC12 cells [49,50]. PrRP has also been shown to stimulate adenylyl cyclase in the PC12 cell line and promote the proliferation of cultured cells [51]. The stimulation of the chicken PrRP receptor expressed in CHO cells by PrRP also leads to the activation of the intracellular PKA signaling pathway [4,52].

PrRP activated the phosphoinositide 3-kinase-protein kinase B/Akt-mammalian target of rapamycin (PI3K-Akt-mTOR) pathway and cell proliferation in primary leiomyoma cells, where GPR10 is aberrantly expressed [53]. Maixnerova et al. showed that both PrRP20 and PrRP31 activated ERK and cAMP response element-binding protein (CREB) signaling and induced prolactin release in the rat pituitary cell line RC-4B/C with equal potency (Figure 2) [54]. Additionally, modified analogs of PrRP20 and PrRP31, either with changes in the amino acids at the C-terminus or with lipidization, strongly induced ERK phosphorylation in CHO cells expressing GPR10 [55].

#### **6. Structure-Activity Relationship Studies**

Two isoforms of PrRP with either 20 or 31 amino acids sharing identical C-termini showed comparable in vitro and in vivo activity [1]. Several SAR studies with PrRP analogs were performed [31,56–58]. No study about selective antagonists of PrRP has been published yet, but in 2010, Otsuka Pharmaceuticals patented nonpeptide heterocyclic antagonists derived from tetrahydropyrido[4,3-d]pyrimidinone developed for stress-related diseases (reviewed in [59]).

First, Roland et al. demonstrated that N-terminal deletions from PrRP20 slightly decreased the affinity of the PrRP analogs for GPR10 [31]. The shortest analog that was still able to bind to GPR10 was the C-terminal heptapeptide PrRP(25–31). However, the binding affinity of this fragment was two orders of magnitude lower than that of PrRP20 and PrRP31, which exhibited affinity in the nanomolar range. The replacement of the C-terminal amide group with an acid resulted in a complete loss of binding affinity [1,31]. Moreover, an alanine scan through PrRP(25–31) showed that the arginines at positions 26 and 30 are crucial for binding to the receptor, and their replacement results in a loss of affinity [31]. D'Ursi et al. described a conformational analysis of PrRP20 using circular dichroism (CD) and nuclear magnetic resonance (NMR) spectroscopies and molecular modeling calculations. The C-terminal region consisted of amphipathic helices with hydrophobic nonpolar side chains of Ala21, Ile25, Val28, and Phe31 and hydrophilic side chains of Arg23, Arg26, and Arg30 [60].
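The design of an alanine scan such as the one described above can be sketched in a few lines of Python. The heptapeptide sequence used here, Ile-Arg-Pro-Val-Gly-Arg-Phe (IRPVGRF in one-letter code), corresponds to residues 25–31 of the PrRP(19–31) sequence quoted in this section; the function name `alanine_scan` is illustrative, and the sketch only enumerates the variants to synthesize. It says nothing about their measured affinities.

```python
def alanine_scan(sequence: str, first_position: int) -> dict:
    """Return {residue position: variant sequence} for a single-site alanine scan.

    Each residue is replaced by alanine ("A") in turn; residues that are
    already alanine are skipped because the substitution would be silent.
    """
    variants = {}
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue
        position = first_position + index
        variants[position] = sequence[:index] + "A" + sequence[index + 1:]
    return variants


# C-terminal heptapeptide PrRP(25-31) in one-letter code (C-terminal amide not shown).
heptapeptide = "IRPVGRF"
variants = alanine_scan(heptapeptide, first_position=25)

for position, variant in variants.items():
    original = heptapeptide[position - 25]
    print(f"{original}{position}A: {variant}")
```

In this enumeration, the variants at positions 26 and 30 (IAPVGRF and IRPVGAF) are the ones in which an arginine is replaced, which are the substitutions reported in the text to abolish binding.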

PrRP could be shortened without a loss of in vitro activity to the tridecapeptide PrRP(19–31), H-Trp19-Tyr20-Ala21-Ser22-Arg23-Gly24-Ile25-Arg26-Pro27-Val28-Gly29-Arg30-Phe31-NH2, which has the minimal length for retaining binding affinity and agonist properties [56]. The binding affinity was significantly decreased by further truncation of the peptide; therefore, the active site is located within the C-terminal region. This large SAR study focused on the replacement of amino acids at positions 21 to 31, with a main focus on the phenylalanine at position 31. Nineteen different amino acids were used, but only residues with a bulky side chain (His(Bzl), Trp, Cys(Bzl), Glu(OBzl), or norleucine (Nle)) or a halogenated aromatic ring (Phe(4-Cl)) led to similar or improved binding affinity and good agonist activity [56]. Replacement of Arg<sup>23</sup> by Pro significantly decreased the affinity. The results confirmed that the functionally important residues are located within the C-terminal segment, with the essential and irreplaceable arginine 30 and the high importance of phenylalanine 31.

Based on a previous study by Boyle et al., Maletínská et al. [58] designed PrRP20 analogs in which Phe<sup>31</sup> was modified by amino acids with different aromatic rings. Phe<sup>31</sup> was replaced by (3,4-dichloro)phenylalanine (PheCl2<sup>31</sup>), (4-nitro)phenylalanine (PheNO2<sup>31</sup>), pentafluoro-phenylalanine (PheF5<sup>31</sup>), naphthylalanine (1-Nal<sup>31</sup>, 2-Nal<sup>31</sup>), or Tyr<sup>31</sup>. In addition, the amino acids cyclohexylalanine (Cha<sup>31</sup>) and phenylglycine (Phg<sup>31</sup>) were included [58]. This study showed that all analogs except [Cha<sup>31</sup>]PrRP20 and [Phg<sup>31</sup>]PrRP20 preserved high binding affinity to rat RC-4B/C pituitary cells and increased the phosphorylation of ERK and CREB in this cell line.

DeLuca et al. performed a structural study based on NMR and CD spectroscopy, in which they determined the α-helical conformation of the C-terminal sequence of PrRP20 in trifluoroethanol [57]. In the shorter PrRP20 analogs PrRP(4–20), PrRP13 (PrRP(8–20)), and the heptapeptide PrRP(14–20), the stability of the helical segment was decreased and biological activity was reduced. Therefore, this stable C-terminal α-helical structure facilitates ligand recognition by the receptor and enables its activation [57].

The lipidization of peptides (i.e., the attachment of fatty acids to peptides through an ester or amide bond) is a useful strategy for designing new peptide drugs. This modification may enhance potency, selectivity, and therapeutic efficacy because it can increase stability and prolong the half-life in an organism. Moreover, it could enable delivery across the blood-brain barrier (reviewed in [61]). One example of a lipidized peptide is liraglutide, an analog of glucagon-like peptide 1 (GLP-1) that is palmitoylated at position 26 via a γ-glutamyl linker [62] and has a strongly prolonged half-life [63]. Therefore, the lipidization of neuropeptides that are involved in food intake regulation might be a new way to develop drugs for the treatment of obesity (reviewed in [64]).

Maletínská et al. designed novel lipidized PrRP analogs with fatty acids of different lengths attached to the N-terminus [55]. All PrRP20 and PrRP31 analogs lipidized with octanoic, decanoic, dodecanoic, myristic, palmitic, and stearic acid had agonist characteristics and preserved high binding affinity to GPR10 compared to native PrRP20 or PrRP31 [55].

Lipidized PrRP31 analogs with noncoded amino acids 1-Nal, PheCl2, PheNO2, PheF5, or Tyr at position 31 and myristoylated or palmitoylated on the N-terminus revealed high binding potency to GPR10. The original methionine at position 8 was replaced by the more stable Nle to avoid oxidation of Met without any loss of binding and signaling activity [65].

Analogs of PrRP31 in which palmitic acid was attached at Lys<sup>11</sup> through a γ-glutamyl linker or a short chain of polyethylene glycol, and an analog with two palmitic acids at Lys<sup>11</sup> and at the N-terminus, were tested in both in vitro and in vivo studies. Binding and signaling experiments showed preserved binding affinity to GPR10, although the analog with two palmitic acids was less potent. A single palmitic acid could be attached at different positions of the chain without the loss of binding affinity [66].

Recently, a new study by Pflimlin et al. was published in which novel long-lasting PrRP analogs with staples incorporating multiple ethylene glycol-fatty acids (MEG-FAs) were synthesized [67]. Crucial arginines at positions 23 and 30 were replaced with homoarginine (hArg), beta-homoarginine (β-hArg), and N-methylarginine (Nme-Arg). All modifications at Arg<sup>30</sup> significantly affected the potency. At Arg<sup>23</sup>, only substitution by Nme-Arg, but not by hArg or β-hArg, decreased the affinity. All synthesized analogs contained dicysteine mutations, the best tolerated of which occurred at positions 6–13, 15–22, and 18–25. As a lead compound, they chose the PrRP analog 18-S4, an analog with Cys<sup>6</sup>, Cys<sup>13</sup>, Nle<sup>11</sup>, and hArg<sup>23</sup> that is stapled at the cysteines by a staple featuring four ethylene glycol units attached to octadecanedioic acid via a lysine linker incorporating a carboxylated moiety. They generated analogs with in vitro selective agonist activity towards GPR10 [67]. The structures of all of the mentioned PrRP analogs are shown in Figure 3.

**Figure 3.** Structure of PrRP20 and PrRP31 and their analogs [31,55,56,58,65,67]. Blue amino acid Ser1 marks the beginning of PrRP31 from the N-terminus. Blue amino acid Thr12 marks the beginning of the shorter isoform PrRP20 (also known as PrRP(12–31)). Blue amino acid Trp19 marks the tridecapeptide PrRP(19–31) and blue Ile25 marks the shortest fragment, the heptapeptide PrRP(25–31). The green amide group NH2 and the green amino acids Arg23, Arg26, Arg30, and Phe31 are essential for the functionality of the peptide. Light grey amino acids mark changes of amino acids that preserved good functional activity. Dark grey Arg<sup>11</sup> could be substituted by Lys<sup>11</sup>, to whose side-chain amino group fatty acids were attached through different linkers. γ-E: γ-glutamic acid, MEG-FA: multiple ethylene glycol-fatty acid (four ethylene glycol units attached to octadecanedioic acid via lysine linker incorporating carboxylated moiety), PheCl2: (3,4-dichloro)phenylalanine, PheNO2: (4-nitro)phenylalanine, PheF5: pentafluoro-phenylalanine, 1-Nal, 2-Nal: naphthylalanine, Phg: phenylglycine, TTDS: short chain of polyethylene glycol (1,13-diamino-4,7,10-trioxatridecan-succinamic acid).

#### **7. PrRP in the Regulation of Food Intake and Energy Expenditure**

#### *7.1. PrRP Decreases Food Intake and Regulates Energy Homeostasis*

PrRP was first shown to cause the release of prolactin from cultured pituitary cells [1], but later studies showed that the main role of PrRP is in food intake regulation ([12,14,68]; reviewed in [64]).

Lawrence et al. suggested an alternative role for PrRP as a regulator of energy homeostasis and food intake [14]. Intracerebroventricular (ICV) injection of PrRP caused a reduction in food intake in fasted and free-fed rats [14]. Moreover, the subsequent decrease in body weight was not only due to the reduction in food intake, which implies an effect on energy expenditure. They supported the findings by measuring *PrRP* mRNA, which was highly expressed in the hypothalamus, NTS, and ventrolateral ME, and *GPR10* mRNA, which was highly expressed in the RT, PEVN, DMN, and NTS, all of which are areas implicated in the regulation of food intake (reviewed in [69–71]).

PrRP also mediated some of the central satiating actions of the gut peptide hormone CCK [12]. The measurement of the induction of c-Fos protein showed that PrRP neurons were strongly activated by the intraperitoneal injection of CCK, and central PrRP administration activated areas of the brain that are common for both PrRP and CCK [12]. Ellacott et al. suggested that the anorexigenic action of PrRP is regulated by the adiposity signal leptin [72]. ICV coadministration of PrRP and leptin reduced food intake and increased body temperature in rats to a greater extent than either peptide alone. Additionally, in situ hybridization showed that *PrRP* mRNA levels were reduced in fasting and obese Zucker rats, indicating that *PrRP* expression is regulated by leptin [72].

Repeated ICV injection of PrRP strongly reduced food intake and body weight in rats without causing any adverse effects on locomotor or sensorimotor activity [73]. PrRP exerted an effect on energy homeostasis in the short to medium term and increased energy expenditure [74].

Through the generation of *GPR10* knockout (KO) mice with targeted deletion of the *GPR10* gene, GPR10 was confirmed to be a major receptor for PrRP in the hypothalamus because this deletion completely prevented PrRP binding to hypothalamic cell membranes [75]. *GPR10* KO mice became hyperphagic and mildly obese at older ages and developed decreased glucose tolerance with elevated levels of insulin and leptin [75]. Male and female *GPR10* KO mice had increased body weight as a consequence of increased fat mass compared to their wild-type (WT) controls [76]. The total levels of plasma leptin and cholesterol were increased, and a decrease in energy expenditure was observed in *GPR10* KO mice [76]. In fasted or satiated *GPR10* KO mice, ICV administration of PrRP did not reduce food intake in contrast to their WT controls. The administration of CCK did not result in the inhibition of food intake in *GPR10* KO mice, suggesting that PrRP is involved in central satiating actions of CCK [77]. *PrRP* KO mice had higher blood glucose levels and corticosterone levels and became obese with higher amounts of adipose or liver tissue than control WT animals [78]. Under stress conditions, *PrRP* KO mice showed increased levels of plasma corticosterone compared to WT mice, which might indicate that PrRP regulates glucose metabolism through corticosterone secretion and/or catecholamine synthesis [78].

PrRP was also shown to mediate its anorexigenic effect through corticotropin-releasing hormone (CRH) receptors, but not through melanocortin receptors [68]. ICV administration of PrRP elevated adrenocorticotropin (ACTH) levels in plasma, and c-Fos protein was increased in the nuclei of CRH-positive cells in the PVN [79,80]. PrRP-positive neurons have synapse-like contact with CRH cell bodies in the PVN [79]. Furthermore, the injection of PrRP directly into the PVN caused an increase in plasma ACTH [81]. Using hypothalamic explant incubations, researchers showed that PrRP increased the hypothalamic release of CRH, which is one of the principal ACTH secretagogues, and the subsequent secretion of ACTH. Therefore, an additional potential role of PrRP in the function of the hypothalamic-pituitary-adrenal (HPA) axis and in the cardiovascular system was suggested [23,81].

#### *7.2. Ortholog C-RFa in Food Intake Regulation*

Similar to the anorexigenic action of PrRP in mammals, ICV injection of ortholog C-RFa also inhibited food intake in goldfish [82]. However, a completely opposite result was observed in chicks, where ICV injection of rat PrRP31 significantly increased food intake, and the orexigenic effect of NPY was enhanced with the coadministration of PrRP [83]. ICV injection of ortholog C-RFa did not affect food intake in chickens [84].

#### *7.3. PrRP Analogs in the Regulation of Food Intake and Energy Expenditure*

The C-terminal 20 amino acids of PrRP (PrRP20) are crucial for preserving the full food-intake-lowering effect. ICV administration of PrRP20 analogs with PheCl2<sup>31</sup>, PheNO2<sup>31</sup>, PheF5<sup>31</sup>, 1-Nal<sup>31</sup>, 2-Nal<sup>31</sup>, or Tyr<sup>31</sup> resulted in decreased food intake in fasted mice [58]. In particular, [PheNO2<sup>31</sup>]PrRP20, [1-Nal<sup>31</sup>]PrRP20, [2-Nal<sup>31</sup>]PrRP20, and [Tyr<sup>31</sup>]PrRP20 showed the most significant and long-lasting anorexigenic effect after ICV administration in fasted lean mice. This study showed that a bulky aromatic ring at the C-terminus, not necessarily that of phenylalanine, enabled full anorexigenic activity [58].

PrRP acts centrally; therefore, the potential of PrRP to decrease food intake after peripheral administration depends on reaching the receptors in the brain and enabling the central effect. Of the analogs with fatty acids of different lengths attached at the N-terminus of PrRP, only myristoylated PrRP20 (myr-PrRP20), palmitoylated PrRP31 (palm-PrRP31), and stearoylated PrRP31 significantly lowered food intake in fasted or freely fed lean mice after subcutaneous (SC) administration [55]. Because they caused a central effect after peripheral administration, these analogs were suggested to cross the blood-brain barrier. Analogs containing shorter fatty acids had no effect on food intake. Moreover, the analogs palm-PrRP31 and myr-PrRP20, but not natural PrRP20 and PrRP31 or octanoylated PrRP31, showed longer stability in rat plasma and significantly increased c-Fos immunoreactivity in hypothalamic and brainstem nuclei that are involved in food intake regulation, such as the PVN, ARC, and NTS.

A significant increase in c-Fos was observed in the PVN, ARC, NTS, and DMN after SC administration of palm-PrRP31. Moreover, palm-PrRP31 administration significantly increased c-Fos in the LHA hypocretin neurons and PVN oxytocin neurons [85].

Palmitoylated or myristoylated PrRP31 analogs with C-terminal changes reduced acute food intake after SC administration in fasted lean mice [65] (reviewed in [64]). Of all the lipidized PrRP analogs, [PheCl2<sup>31</sup>]PrRP31 palmitoylated or myristoylated at the N-terminus showed the strongest and longest-lasting anorexigenic effect in fasted mice [65]. In free-fed Wistar rats, palm-PrRP31 strongly reduced food intake when injected peripherally. Peripheral injection of palm-PrRP31 induced an increase of c-Fos protein in the PVN, NTS, and ARC, which are specific brain regions that are involved in food intake regulation [86].

In diet-induced obese (DIO) mice, a 2-week-long SC administration of palm-PrRP31 and myr-PrRP20 significantly lowered food intake, decreased body weight, improved metabolic parameters such as plasma insulin and leptin, and attenuated lipogenesis compared to lean controls [55].

Repeated administration of PrRP analogs palmitoylated through different linkers to Lys<sup>11</sup>, but not of the analog with two palmitic acids, reduced body and liver weights and the levels of plasma insulin, leptin, triglycerides, cholesterol, and free fatty acids in DIO mice. Moreover, the expression of *uncoupling protein 1* (*UCP-1*) was increased in brown adipose tissue (BAT), suggesting an increase in energy expenditure [66]. A single dose of PrRP31 palmitoylated at Lys<sup>11</sup> through a γ-glutamyl linker (palm11-PrRP31) again caused neuronal activation and decreased food intake, suggesting its central effect after peripheral administration [66]. Among the analyzed brain nuclei involved in food intake regulation, this lipidized analog palm11-PrRP31 increased neural activity, represented by increased FosB immunostaining, only in the DMN and VMN [87].

The chronic effect of palm-PrRP31 was studied in DIO Sprague-Dawley rats and leptin receptor-deficient Zucker diabetic fatty (ZDF) rats, where palm-PrRP31 was intraperitoneally administered for two weeks. Palm-PrRP31 lowered food intake and body weight, improved glucose tolerance, and tended to decrease leptin levels and adipose tissue in DIO rats [88]. In contrast, the administration of palm-PrRP31 lowered food intake, but it did not significantly affect body weight or glucose tolerance in ZDF rats.

Repeated administration of the lipidized PrRP analog palm11-PrRP31 improved glucose tolerance in Koletsky-spontaneously hypertensive obese (SHROB) rats, which have mutations in their leptin receptor and, therefore, impaired leptin signaling [89]. These findings suggest that the effect of palm11-PrRP31 on glucose metabolism is independent of leptin signaling and body weight lowering. Treatment with palm11-PrRP31 also decreased body weight in control spontaneously hypertensive rats (SHRs), but not in SHROB rats. It seems that the anorexigenic effect of palm11-PrRP31 depends on proper leptin signaling. Moreover, in SHROB rats, palm11-PrRP31 ameliorated the insulin/glucagon ratio and increased *insulin receptor substrate 1* and *2* expression in fat and insulin signaling in the hypothalamus, while it had no effect on blood pressure [89]. The increase in all of these parameters pointed to a beneficial effect of palm11-PrRP31 on the diabetic state. Additionally, in SHRs and normotensive Wistar Kyoto (WKY) rats on a high-fat diet, treatment with palm11-PrRP31 lowered body weight and improved biochemical and biometric parameters. Palm11-PrRP31 also improved glucose tolerance in WKY rats [90].

Novel long-lasting PrRP analogs with cysteine mutations and staples with attached octadecanedioic acid showed enhanced plasma stability and a prolonged half-life in mice. During a 12-day SC administration, the 18-S4 analog significantly reduced body weight in DIO mice [67].

Taken together, PrRP and palmitoylated PrRP analogs are anorexigenic peptides that strongly reduce food intake by reducing appetite and impact energy expenditure under the control of leptin. Proper leptin signaling is necessary for the anorexigenic effect of PrRP and its analogs. Palmitoylated PrRP analogs activate c-Fos in specific neuron populations that are connected to the regulation of food intake. Moreover, lipidization prolonged the half-life of PrRP analogs and enabled central action, leading to a strong food-intake-lowering effect after peripheral administration in mice and rats [55,64,66].

#### **8. Neuroprotective Properties of PrRP**

Obesity and type 2 diabetes mellitus were recently identified as risk factors for the development of neurological disorders, such as Alzheimer's disease (AD). Thus, anorexigenic and/or antidiabetic substances began to be examined as compounds with potential neuroprotective properties. This potential is supported by the finding that receptors of anorexigenic peptides, such as GPR10 or the GLP-1 receptor, are expressed in the hippocampus, which is the first brain region affected during AD.

Extracellular senile plaques formed by aggregated β-amyloid protein (Aβ) and intracellular neurofibrillary tangles formed by hyperphosphorylated tau protein are two hallmarks of AD [91,92]. However, other pathological features are observed in AD patients, such as decreased synaptic plasticity and neurogenesis or increased neuroinflammation [93].

The neuroprotective properties of the lipidized PrRP analogs palm-PrRP31 and palm11-PrRP31 were examined in vitro as well as in vivo in several rodent models of neurodegeneration. The results were reviewed in depth by Maletínská et al. [94].

The effect of human PrRP31 and its lipidized analog palm11-PrRP31 on tau hyperphosphorylation was examined in vitro using a model of hypothermia in the neuroblastoma cell line SH-SY5Y and in rat primary cortical neurons. Hypothermic conditions resulted in increased tau hyperphosphorylation at several epitopes, including pThr212 and pSer396/pSer404, in both cellular models. In SH-SY5Y cells, incubation with palm11-PrRP31, as well as with PrRP31, attenuated tau hyperphosphorylation at pThr212. In primary cortical neurons, palm11-PrRP31 decreased tau hyperphosphorylation at both pThr212 and pSer396/pSer404. On the other hand, human PrRP31 did not affect phosphorylation at pThr212 or at pSer396/pSer404 in primary cortical neurons [95].

The effect of PrRP on tau hyperphosphorylation was extensively studied in vivo using different mouse models. Mice with obesity induced by monosodium glutamate (MSG mice) [96,97] develop increased tau hyperphosphorylation due to central insulin resistance manifested by decreased activation of the insulin signaling cascade. Palm-PrRP31 ameliorated the activation of the insulin signaling cascade and subsequently decreased tau phosphorylation at several epitopes, such as pThr231 and pSer396 [26]. A similar effect on tau hyperphosphorylation was observed in the THY-Tau22 mouse model, where the intervention with palm11-PrRP31 also improved short-term spatial memory in the Y-maze test and increased synaptic plasticity compared to the vehicle-treated group [25]. The modulation of synaptic transduction was also examined in a study by Lin et al. [30], where they showed that GPR10 modulates the scaffolding and trafficking of the glutamate-gated cation channel α-amino-3-hydroxy-5-methylisoxazole-4-propionic acid receptor to the postsynaptic membrane, which is necessary to mediate fast excitatory transmission in the brain.

APP/PS1 mice, which are double transgenic mice expressing mutated amyloid precursor protein (APP) (Swedish mutation, K595N/M596L) and mutated presenilin 1 (PS1) carrying the exon 9 deletion (deltaE9), are one of the most frequently used models to study Aβ pathology [98]. Treatment with the lipidized analog palm11-PrRP31 decreased the amount of senile Aβ plaques in APP/PS1 mice. Moreover, palm11-PrRP31 lowered two markers of neuroinflammation that are colocalized with Aβ plaques: ionized calcium-binding adapter molecule 1 (Iba1), which is a marker of activated microglial cells, and glial fibrillary acidic protein (GFAP), which is a marker of reactive astrocytes. Potential neuroprotective properties were further manifested by increased levels of doublecortin, a marker of neurogenesis, in the hippocampi [24].

In conclusion, palmitoylated analogs of PrRP31 seem to be potential tools to treat neurological disorders. However, the mechanism of action remains unclear and must be further studied.

#### **9. Other Physiological Functions of PrRP**

*PrRP* and *GPR10* are expressed in many brain regions that control different physiological functions. It seems that PrRP plays an important role in the stress response (reviewed in [99]). PrRP-producing neurons in the ME were activated in response to some stressful stimuli, such as foot shock stress [100]. Moreover, *PrRP* KO mice were found to react differently to restraint stress than their WT littermates; *PrRP* KO mice had increased blood glucose and corticosterone levels [78]. This study was supported by the finding that neurons producing noradrenaline, which is known as a stress mediator in the CNS, are colocalized with PrRP neurons in the NTS and ventral and lateral reticular nuclei in the ME, and coadministration of PrRP and noradrenaline synergistically increased the release of pituitary ACTH [18]. In the NTS, PrRP-immunopositive neurons are located in close proximity to GLP-1-immunopositive neurons, and signaling through GLP-1R modulates the activity of PrRP neurons [101]. Both neuronal populations are activated after exposure to stressors and seem to contribute to the central control of stress. The PrRP neural populations from the ME project to the PVN in the hypothalamus, where CRH and oxytocin, both of which are modulators of the stress response, are produced [79]. Consistent with this, ICV administration of PrRP increased the level of corticosterone and oxytocin in the blood. In addition, the administration of PrRP antibodies abolished stress-induced activation of the PVN and attenuated oxytocin release to the blood [102]. The coadministration of PrRP and astressin, a CRH receptor antagonist, blocked ACTH release; thus, the CRH receptor is important for PrRP action [68]. The physiological role of PrRP is well reviewed by Lin [29], Dodd et al. [3], and Quillet et al. [103].

The effect of PrRP on CRH release could be responsible for the increased heart rate and blood pressure that were observed after central PrRP administration [23]; thus, PrRP could be involved in the regulation of the cardiovascular system (reviewed in [22]). It seems that the effect of PrRP on the increase in blood pressure is not mediated by GPR10 since PrRP was able to increase blood pressure in Otsuka Long-Evans Tokushima Fatty (OLETF) rats that have mutated *GPR10* [104].

A high density of GPR10-producing neurons is observed in the PB, which is responsible for the regulation of nociception. These neurons also produced enkephalins, which are pain suppressors that bind to opioid receptors, which suggests the control of enkephalin production by PrRP [105]. The role of PrRP in nociception is supported by the finding that *GPR10* KO mice have a higher nociceptive threshold and increased stress-induced analgesia. Thus, PrRP could act as a potential antagonist of the opioid system [106].

It was also demonstrated that PrRP may affect the function of chromaffin cells because PrRP and its receptor are highly expressed in the adrenal medulla [39,44]. Moreover, PrRP-immunopositive cells were found in the rat adrenal gland [107]. On the basis of these results, it was suggested that PrRP may play an important role in modulating catecholamine secretion [49].

Due to its distribution, PrRP could also be involved in sexual and reproductive function or in sleep and the control of circadian rhythms (in the ME) [19,20]. PrRP is expressed in brain areas that are implicated in reproduction (in the DMN and ME) and also in the periphery, in the rat testis and epididymis [39,59]. Feng et al. suggested that PrRP could be involved in the regulation of the female rat estrous cycle [108]. The brain *PrRP* mRNA level was higher in proestrus and estrus in female rats. Moreover, they found colocalization of GPR10-immunoreactive neurons and gonadotropin-releasing hormone in the hypothalamic medial preoptic area. The study by Maruyama et al. showed that ICV administration of PrRP increases plasma oxytocin in rats, and they suggested a role for PrRP as a neuromodulator of oxytocin neurons in the brain [109]. There is also some evidence that PrRP is involved in lactation and that PrRP levels are regulated by hormonal changes [100].

#### **10. Conclusions**

PrRP, with its conserved RF-amide sequence at the C-terminus, is a potent anorexigenic neuropeptide, decreasing food intake and enhancing energy metabolism. Moreover, it regulates other physiological functions, such as the cardiovascular system, stress, and reproduction, and has neuroprotective properties. These functions are mainly mediated through the receptor GPR10.

The use of specific model systems, particularly *PrRP*/*GPR10* KO animals, can contribute to an understanding of the molecular mechanisms of PrRP action, thereby contributing to a faster use of PrRP analogs for potential therapy. From our several recent studies, it is clear that lipidized PrRP analogs could have therapeutic potential. Further progress in the development of selective PrRP analogs may contribute to their use not only in the treatment of obesity, but also in the treatment of other metabolic or neurodegenerative diseases.

**Author Contributions:** V.P. collected the bibliography, wrote the manuscript, and prepared the figures; A.P. contributed to the writing; J.K. contributed to the figure design; J.K. and L.M. conceived the topic and revised the review.

**Funding:** This research was funded by the Grant Agency of the Czech Republic (grant number 18-10591S) and the Academy of Sciences of the Czech Republic (RVO: 61388963).

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Chemotactic Ligands that Activate G-Protein-Coupled Formylpeptide Receptors**

#### **Stacey A Krepel and Ji Ming Wang \***

Cancer and Inflammation Program, Center for Cancer Research, National Cancer Institute at Frederick, Frederick, MD 21702, USA

**\*** Correspondence: wangji@mail.nih.gov; Tel.: +1-301-846-6979

Received: 19 June 2019; Accepted: 5 July 2019; Published: 12 July 2019

**Abstract:** Leukocyte infiltration is a hallmark of inflammatory responses. This process depends on the bacterial and host tissue-derived chemotactic factors interacting with G-protein-coupled seven-transmembrane receptors (GPCRs) expressed on the cell surface. Formylpeptide receptors (FPRs in human and Fprs in mice) belong to the family of chemoattractant GPCRs that are critical mediators of myeloid cell trafficking in microbial infection, inflammation, immune responses and cancer progression. Both murine Fprs and human FPRs participate in many patho-physiological processes due to their expression on a variety of cell types in addition to myeloid cells. FPR contribution to numerous pathologies is in part due to its capacity to interact with a plethora of structurally diverse chemotactic ligands. One of the murine Fpr members, Fpr2, and its endogenous agonist peptide, Cathelicidin-related antimicrobial peptide (CRAMP), control normal mouse colon epithelial growth, repair and protection against inflammation-associated tumorigenesis. Recent developments in FPR (Fpr) and ligand studies have greatly expanded the scope of these receptors and ligands in host homeostasis and disease conditions, therefore helping to establish these molecules as potential targets for therapeutic intervention.

**Keywords:** formyl peptide receptors; ligands; diseases

#### **1. Properties of FPRs**

Formylpeptide receptors (FPRs; Fprs in mice) are G-protein-coupled receptors and were incidentally the first GPCRs to be identified in neutrophils [1]. Though initially cloned from neutrophils, FPRs have since been identified in macrophages, endothelial cells, intestinal epithelial cells, fibroblasts, and others [2–5]. Humans possess three different forms of FPRs: FPR1, FPR2, and FPR3. FPR1 was the first of the receptors to be named, and it was initially discovered as the receptor for the formylated bacterial product formyl-methionine-leucyl-phenylalanine (fMLF), the name of which gave rise to the naming of the receptor in question [6]. FPR1 is most highly expressed in cells in the bone marrow and immune system, though it is also notably expressed in cells of the lungs, brain, and gastrointestinal (GI) tract, among others [7,8]. FPR2 was the second of these receptors to be discovered, but it tends to be the more ubiquitously expressed of the two. It is primarily expressed in cells of the bone marrow, immune system, GI tract, female organ tissues, and endocrine glands, though there are also some lower levels of expression in cells of the brain, liver, gallbladder, and pancreas [8,9]. Little is known about the biological significance of FPR3, and very little research has been done to elucidate its role. This receptor is mainly expressed in monocytes and dendritic cells but not neutrophils, and, unlike its counterpart receptors, it resides in intracellular vesicles rather than on the cell surface. FPR3, contrary to the other FPR variants, also has only one known endogenous peptide agonist [10,11].

The primary roles of FPRs involve cell chemotaxis in response to agonists, and new research has shown that they even contribute to direct phagocytosis of bacteria by neutrophils [12,13]. Activation of such receptors is also important for wound healing and gut development [14,15]. However, while FPRs were initially thought to only be responsible for neutrophil chemotaxis, FPR1 and FPR2 have both been shown to play pivotal roles in the progression of multiple diseases. For example, FPR2 may promote the malignancy of colon cancer, while FPR1 has similarly been tied to the progression of glioblastoma [16,17]. Conversely, FPR1 has demonstrated tumor suppressor functions in gastric cancer [18]. With such dual roles in cancer progression, it is clear that further mechanistic studies would contribute greatly to our understanding of FPRs, thus potentially leading to new therapeutics. While aberrant expression or activation of FPRs can be detrimental, somewhat contradictory findings demonstrate that constitutively active FPR was indispensable in the defense against the formation of biofilms by *Candida albicans*, as well as aggressive infiltration by *Vibrio harveyi* [19]. FPR-mediated cell activation does not solely include chemotactic and pro-inflammatory responses to pathogens, as activation also plays a very important role in the protection against other pathologies. Gobbetti et al. [20] demonstrated that Fpr2 confers protection against sepsis-mediated damage in mice. Furthermore, FPRs have been reported to mediate anxiety-related disorders and the resultant altered behaviors [21]. Lastly, a novel discovery for FPRs suggests that they act as mechanoreceptors in arteries, making them critical for proper arterial plasticity [22]. Given such diverse functions of these receptors, it should come as no surprise that FPRs respond to a plethora of ligands with diverse classifications.

While most FPR ligands induce cell chemotaxis, calcium flux, and even phagocytosis, they stimulate many other cell functions as well [23]. For instance, some ligands elicit inflammatory processes for the clearance of infection, recruitment of immune cells, etc., while other ligands activate pro-resolving, anti-inflammatory pathways. There are few chemoattractant GPCRs capable of transmitting both pro- and anti-inflammatory signals. This duality in FPR2 is initially determined by the nature of the ligands. Bacterial and mitochondrial formylated peptides are among those that classically activate a pro-inflammatory cell response to clear invaders and tissue damage, while Annexin A1 (Anx A1) and Lipoxin A4 (LXA4) are some of the better-known anti-inflammatory FPR2 ligands [24]. Cooray et al. [25] demonstrated that the switch between FPR2-mediated pro- and anti-inflammatory cell responses is due to conformational changes of the receptor upon ligand binding: binding of anti-inflammatory ligands such as Anx A1 caused FPRs to form homodimers, which led to the release of inflammation-resolving cytokines like IL-10. Conversely, inflammatory ligands such as serum amyloid A (SAA) did not cause receptor homodimerization.

Though some of the diverse FPR (Fpr) ligands are small molecules or non-peptides, the majority are small peptides, either synthetic or natural, with origins ranging from host and multicellular organisms to viruses and bacteria. These peptides have been extensively studied, and patterns of recognized elements have begun to emerge. As is demonstrated below, a formylated methionine in the peptide generally favors activation of FPR1, while FPR2 is less dependent upon this particular residue [26]. Expanding upon this, Bufe et al. [27] concluded that FPR recognition of bacterial peptides requires either a formylated methionine at the N-terminus or an amidated methionine at the C-terminus of a peptide, though they propose that, as a general principle, the secondary structure rather than the primary sequence is important for recognition of the highly diverse ligands by FPR. One class of ligands discussed shortly comprises the phenol-soluble modulins, and examination of these has led to the conclusion that FPR1 favors short, flexible structures, while FPR2 has a binding preference for longer peptides which are amphipathic in nature and may contain alpha helices [28]. The class of ligands mentioned above represents a small proportion of the known FPR agonists today, and this review will focus on formylated, microbe-derived, and mitochondrial peptides, as well as host and non-microbial, non-host peptides. Host-derived non-peptides, as well as synthetic or small-molecule ligands, will also be discussed.
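The terminal-residue criterion reported by Bufe et al. [27] can be illustrated with a minimal Python sketch. It assumes a simplified text notation modeled on the peptide names used in this review (a leading `f` for N-formylation, a trailing `-NH2` for C-terminal amidation, lowercase letters for D-amino acids); the function name is hypothetical. The sketch checks only the primary-sequence criterion and says nothing about secondary structure, which the same authors regard as the more general determinant of recognition.

```python
def terminal_motif(peptide):
    """Report which terminal methionine motif, if any, a peptide carries.

    Assumed notation (illustrative only):
      - leading 'f' or 'f-'  -> N-formylated peptide (e.g. fMLF, f-MIFL)
      - trailing '-NH2'      -> C-terminally amidated peptide
      - one-letter residue codes, lowercase for D-amino acids
    Limitation: a peptide beginning with D-Phe ('f') would be misread
    as formylated under this shorthand.
    """
    amidated = peptide.endswith("-NH2")
    core = peptide[:-4] if amidated else peptide

    formylated = core.startswith("f")
    if formylated:
        core = core[1:].lstrip("-")

    if not core:
        return "none"
    if formylated and core[0] == "M":
        return "N-terminal formyl-Met"
    # Assumption: both L-Met ('M') and D-Met ('m') count as methionine here.
    if amidated and core[-1] in ("M", "m"):
        return "C-terminal amidated Met"
    return "none"


for name in ("fMLF", "f-MIFL", "WKYMVm-NH2", "MMWLL"):
    print(name, "->", terminal_motif(name))
```

Such a check is at most a first-pass filter; whether a peptide actually activates FPR1 or FPR2 still has to be established experimentally.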

#### **2. Formylated Peptides**

The prototypic FPR ligand is fMLF, which was the first classified FPR agonist and also represents the shortest sequence to elicit a potent receptor response. While FPR2 is generally considered the more promiscuous FPR, fMLF preferentially activates FPR1 with high affinity [29]. It has been shown that non-formylated bacterial peptides are much less potent than their formylated counterparts; a suggested reason is that bacteria initiate protein synthesis with formyl-methionine, which therefore marks a protein as pathogenic from the perspective of the host immune system [30]. There are many derivatives of fMLF which elicit FPR responses, many of which preferentially activate FPR2 rather than FPR1. Such derivatives include the peptide sequences fMLFII, fMLFIK, fMLFK, and fMLFW, among others [31]. Liu et al. [32] summarized many other formylated peptides that elicit responses through both FPR1 and FPR2, including f-MIFL, f-MIVIL, f-MIGWII, and f-MFEDAVAWF. Dozens of similar peptides have been isolated from equally numerous bacterial genera including *Streptococcus*, *Haemophilus*, *Salmonella*, *Hydrogenobacter*, *Listeria*, *Neisseria*, *Staphylococcus*, and others [27]. Phenol-soluble modulin α (PSMα) is another formylated peptide that has shown great efficacy in FPR activation. Thus far, the phenol-soluble modulins that have demonstrated a capacity for FPR activation include β2, α1, and α2, all of which are virulence factors isolated from *Staphylococcus aureus*. All three peptides activate FPR2, though the structural basis for activation remains unknown [33].

It is interesting to note that mammalian cell mitochondria, well-known for being bacterial in origin, also contain peptides that elicit FPR-mediated responses. However, while bacterial formylated peptides are considered pathogen-associated molecular patterns (PAMPs), the mitochondrial peptides are generally associated with cellular damage and are thus considered damage-associated molecular patterns (DAMPs) that elicit an inflammatory response [34]. These peptides, some of which include MMYALF, MFADRW, and Nle-LF-Nle-YK, have been tied to constriction of airways in the lungs, as well as neutrophil accumulation and other proinflammatory responses [35,36]. Likewise, lung diseases—including acute respiratory distress syndrome—have been found to have a higher presence of formylated mitochondrial peptides in bronchoalveolar lavage fluid, suggesting that lung inflammation is tightly tied to Fpr1 activation via these DAMPs [37]. Mitocryptide-2 (MCT-2) is another mitochondrial peptide. It is related to Cytochrome B of the electron transport chain and activates FPR2. Interestingly, while the N-terminal formylated methionine is entirely necessary for FPR2 activation by MCT-2, the carboxy-terminal residues are also important for receptor activation. Additionally, sequence analysis has revealed that the presence of the residues Thr7 and Ser8 can activate FPR2, though the peptides were most potent with Ile7 and Asn8 residues [38].

Another important mitochondrial peptide is derived from nicotinamide adenine dinucleotide (NADH) dehydrogenase subunit 1 (ND-1) and elicits a strong inflammatory response via FPR1 [39]. Bufe and Zufall [40] have successfully created a model that accurately predicts known mitochondrial peptide agonists for FPRs. Therefore, there is a strong likelihood for the discovery of many new mitochondrial, host-derived FPR agonist peptides. Another DAMP that elicits FPR-mediated responses is Mitochondrial Transcription Factor A (Tfam). While this necrosis-associated peptide has previously been shown to activate FPRs, activation does not appear to play a crucial role in the inflammatory responses of monocytic microglia in the brain; this may be due to lower FPR expression in this cell type [41,42]. The formylated peptide ligands for FPRs (Fprs) are listed in Table 1, and the mitochondrial peptides are listed in Table 2.


**Table 1.** Formylated bacterial peptide agonists for formylpeptide receptors (FPRs) (Fprs).

**Table 2.** Mitochondrial peptide ligands for FPRs (Fprs).


#### **3. Microbe-Derived Peptides**

Table 3 lists the microbe-derived FPR (Fprs) ligands. While formylated peptides first drew the attention of the scientific community to FPRs (Fprs), there are many other bacterial/viral peptides that are not necessarily formylated but which nevertheless elicit receptor responses. Although the majority of formylated microbial peptides preferentially activate FPR1, the preferred receptor for non-formylated peptides is FPR2 [26]. A large percentage of these non-formylated microbe-derived peptides are viral, and many of them are derived from the Human Immunodeficiency Virus (HIV) envelope proteins, including gp41 T20/DP178, gp41 T21/DP107, gp120 V3 loop, gp41 N36, gp120 F, and gp41 MAT-1 [44–46]. Despite the potential importance of FPRs in HIV research, very little work has been done to further explore this connection. However, Li et al. [47] demonstrated that persistent FPR activation desensitized host CCR5 and CXCR4 co-receptors to HIV proteins, thus reducing viral entry and subsequent replication. Still other viruses, including Hepatitis C Virus, HKU-1 coronavirus, and Herpes Simplex Virus, produce chemotactic ligands C5a, N-formyl HKU-1 coronavirus peptide, and gG-2p20, respectively, for FPR1 or FPR2 activation [48–50]. There is, however, some argument as to the efficacy of the Herpes Simplex viral peptide as an FPR agonist, as the overlapping sequence gG-2p19 was unable to definitively demonstrate that FPR activation played a significant role in the NK response to this virus [51].


**Table 3.** Microbe-derived peptide ligands for FPRs (Fprs).

Mills [49] used sequence homology with T20/DP178 to determine that peptides from the OC43, 229E, and NL63 coronaviruses, and even from the Ebola spike protein, contain aromatic-rich domains and elicit FPR-dependent cell activation. Interestingly, when examined from the context of the FPRs rather than the ligands, it was found that domain variability in the receptors determined ligand binding and subsequent cellular responses. This led to the conclusion that the variability of receptors among individuals might predispose to or protect against certain viral infections, susceptibility to which may be determined by receptor activation. In terms of non-viral and non-formylated microbe-derived peptides, there are few FPR agonists. Certain peptides from different strains of *Enterococcus faecium* have demonstrated FPR activation properties, though the ligand activity is not entirely predictable based on structure. Interestingly, *E. faecium* strains that are resistant to vancomycin contain potent FPR2 agonists, suggesting a potential role for FPR2 in antibiotic-resistant infections [52].

#### **4. Host-Derived FPR Ligands**

There are many different host-derived ligands that elicit strong FPR responses, though they have different biological implications. Misfolded proteins implicated in a variety of pathologies are one class of host-derived proteins eliciting FPR activation. Amyloid β-42 (Aβ-42), a peptide fragment well-documented in Alzheimer's Disease (AD), interacts with FPR2. Additionally, part of FPR2's role includes interaction with the Macrophage Receptor with Collagenous Structure (MARCO) scavenger receptor, which is responsible for reducing inflammation and alleviating inflammation-associated symptoms [60]. Interestingly, the symptoms that are viewed as a hallmark of AD are due to this same ligand binding an entirely different group of receptors which do not internalize it, therefore leading to an increased inflammatory pathology [61]. The prion peptide fragment PrP106-126 also interacts with FPR2 on astrocytes and microglia, and the internalization of this peptide is detrimental to the host and contributes to disease progression [62]. Another neuropeptide that activates FPR2 (though other studies claim it also activates FPR1) is the pituitary adenylate cyclase-activating polypeptide 27 (PACAP27), which has been shown to induce migration and Ca<sup>2+</sup> mobilization, as well as upregulation of CD11b in neutrophils [63,64]. Another FPR agonist, the Vasoactive Intestinal Peptide (VIP), activates monocytes via FPR2 and may initiate an inflammation-resolving process [64,65].

Other host-derived FPR ligands are CKβ8-1 and the SHAAGtide sequence, as well as various domains of the Urokinase-Type Plasminogen Activator Receptor (uPAR). CKβ8-1 is also known as the CCL23 chemokine, and it acts as an agonist for FPR2, along with its truncated N-terminal peptide called the SHAAGtide sequence that activates a chemokine GPCR [66,67]. It has been demonstrated that several uPAR peptides elicit FPR responses, including uPAR88-92, uPAR84-95, D2D3, and the SRSRYp sequence [68–70]. These sequences may foster the transition of fibroblasts to myofibroblasts, therefore increasing the pathology of fibrosis via FPR2 activation [68]. Another host-derived peptide, F2L, is derived from the N-terminus of the heme-binding protein HEBP1. As the sole agonist specific for FPR3, F2L activates macrophages and possibly dendritic cells (DCs) as well [71]. However, other studies have demonstrated that FPR2 in neutrophils also exhibits a moderate affinity for F2L, though in this scenario F2L appears to have an inhibitory rather than a stimulatory effect [72,73]. A newer FPR ligand is Family with Sequence Similarity 3 (Member D), or FAM3D. This chemokine-like peptide is most highly expressed in the GI tract, though it is also expressed in cells of the immune system [8,74]. FAM3D has demonstrated a high affinity for both FPR1 and FPR2 and has been implicated in playing an important role in both inflammation and GI homeostasis via FPR activation [75]. Additional studies have found that FAM3D may also be involved in the beneficial role of glucagon secretion in Type 2 diabetes, as well as the detrimental development of abdominal aortic aneurysms [76,77].

In addition to endogenous peptides associated with cell-surface proteins and functional units, there are many ligands that are secreted by cells in response to tissue damage. Annexin-1 (AnxA1), also called Lipocortin-1, is an anti-inflammatory protein which is upregulated as a result of the stress responses of multiple host systems [78]. Some studies demonstrate AnxA1 to be an FPR1 agonist, while others show it as an FPR2 agonist; hence, it likely activates both. One study demonstrated its role in the attenuation of rheumatoid arthritis symptoms by decreasing fibroblast-like synoviocyte proliferation via FPR2 [79]. However, another study showed that AnxA1 initiated autocrine signaling in breast cancer via FPR1 and led to an increase in tumor growth and metastasis [80]. Additionally, the absence of AnxA1 has been tied to increased disease severity in both rheumatoid arthritis and obstructive pulmonary disease, leading to the hypothesis that treatment with exogenous AnxA1 may help reduce symptoms associated with different inflammatory pathologies [81,82]. In addition to AnxA1, multiple derivatives of the parent protein, including Ac1-25, Ac2-26, and Ac9-25, activate FPR2 [83–85]. These peptides have protective effects in ischemia-induced lung injury and atherosclerosis [84,86]. As with its pro-survival property, AnxA1 acts as a double-edged sword: secretion of either AnxA1 or Ac2-26 by tumor-associated fibroblasts induces the acquisition of stem-like features in prostate cancer cells, thus leading to a worse prognosis [87].

Serum amyloid A (SAA) is an endogenous FPR2 agonist secreted by the liver or by macrophages in response to inflammatory stress and, more notably, tissue damage. In endothelial cells, SAA enhances the expression and activity of Tissue Factor, a protein necessary for clotting and wound repair, while additionally inhibiting the activity of Tissue Factor Pathway Inhibitor. Both functions were demonstrated to be the result of FPR2 activation [88]. SAA, via FPR2 activation, additionally increases the production of the wound-healing chemokine CCL2 by vascular endothelial cells [89]. More recent studies have demonstrated the role of an SAA-FPR2 axis in neovascularization in the cornea as well [90,91].

Human LL-37 is an antimicrobial peptide that induces Cxcl13 and Tnfsf13b transcription, as well as B cell activation and proliferation via FPR2. It also contributes to the maintenance of B-cell germinal centers in Peyer's Patches of the gut [92]. LL-37 also promotes the growth of both colorectal and ovarian cancer cells [93,94]. The murine homologue of LL-37, Cathelicidin-related antimicrobial peptide (CRAMP), is similarly an Fpr2 agonist and has been shown to promote atherosclerosis and DC maturation [95,96]. CRAMP also plays a pivotal role in maintaining the homeostasis of the colon mucosa and microbiota balance, demonstrating its potential as a therapeutic molecule [97]. The list of host-derived FPR (Fpr) ligands is shown in Table 4.


**Table 4.** Host-derived FPR (Fpr) ligands and their classification.

#### **5. Synthetic Peptides and Non-Peptide Small Molecules**

By far the most extensive category of FPR ligands includes the synthetic and small-molecule ligands, of which there are over 40 currently known, as shown in Table 5. W-peptides are among the better-known synthetic peptides acting as FPR agonists, and they include the sequences WKYMVm-NH2 and WKYMVM-NH2, as well as many derivatives. Bufe et al. [27] demonstrated that both FPR1 and FPR2 mediate Ca<sup>2+</sup> mobilization responses in leukocytes to more than 20 different combinations and derivatives of the W-peptide. A breakdown of the data suggests that certain residues in the peptide sequence are more important than others: C3 tyrosine, C4 methionine, and C6 D-methionine are all required for ligand activity, as is the carboxy-terminal NH2. Peptides may additionally be shortened on the N-terminus by two amino acids or elongated by three amino acids before FPR activation capacity is severely diminished. While the applications of WKYMV-sequences have been little explored, one recent study showed that the activation of FPR2 by WKYMVm may enhance the homing of endothelial cells, thus improving tissue healing, especially in ischemic neovasculogenesis in injured limbs [3]. Interestingly, another study showed that WKYMVm was capable of desensitizing HIV coreceptors CXCR4 and CCR5, therefore decreasing the entry of HIV-1 into macrophages and CD4+ T cells [47].

M-peptides are another subclass of synthetic/peptide library isolates and include MMK-1 and MMWLL. MMK-1, an FPR2 agonist, is by far the more commonly used peptide, and studies have shown that it may be useful as an anti-anxiety drug, as well as a drug to counteract hair loss from chemotherapy [98,104]. However, there has been some concern about its use in certain drug regimens, as it may amplify the response of monocytes to SiO2-coated nanoparticles, making it an important factor in calculating the proper dosing when using such nanoparticles [105]. MMWLL is another M-peptide and is specific for FPR1. It is not classically a formylated peptide, though the addition of a formylated methionine induces a more potent FPR1 response than even the classic prototypic fMLF [106]. A much newer class of synthetic peptides are the FPR1-agonistic AApeptides, based on the general structure of N-acylated-N-aminoethyl amino acid residues. There are three different AApeptide subgroups called the α-peptides, α-AApeptides, and γ-AApeptides, all of which have different R-groups at the designated α or γ position. Most of these AApeptide derivatives induce Ca<sup>2+</sup> mobilization in rat basophilic leukemia (RBL) cells transfected with human FPR1. The γ-AApeptide Compound 7 at 10 μM elicited a more potent cell response than fMLF at the same concentration, suggesting reasonably strong FPR1 agonism at this concentration, though not at lower concentrations [107]. See Table 5 for all synthetic peptides.


**Table 5.** Synthetic peptide ligands for FPRs (Fprs).

Some newer, non-peptide synthetic compounds being studied are various derivatives of ureidopropanamide molecules. They have demonstrated a capacity for protection against LPS-induced microglial death via an FPR2-dependent pathway, and pre-treatment with concentrations as low as 1 μM showed the protective effects [24]. Thus, these ligands show promising potential as treatment for diseases associated with inflammation in the Central Nervous System (CNS). Another pro-resolving synthetic ligand is the quinazolinone derivative Quin-C1. It is an Fpr2 agonist and has been shown to be effective at reducing inflammatory cytokines and clearing neutrophils and lymphocytes in murine models of lung injury [114]. Schepetkin et al. [115] demonstrated that three different synthetic molecules are agonists for both FPR1 and FPR2. Two of these are bombesin-related BB1/BB2 antagonists called PD168368 and PD176252, and they induce Ca<sup>2+</sup> flux as well as neutrophil degranulation with EC50 values in the nanomolar range, thus making them potent FPR agonists. The third agonist is the Cholecystokinin-1 receptor agonist A-71623, which exhibits FPR1 and FPR2 agonism, though with a much higher EC50. In structural studies, these ligands were cross-reactive with FPR1/2 and possessed both Trp and N-phenylurea moieties. This led to the hypothesis that the combination of moieties greatly increases the chance that an agonist will activate both receptors.

Kirpotina et al. [116] screened over 6000 compounds and isolated nearly 30 different FPR1 and/or FPR2 agonists, all of which are derivatives of acetohydrazide, 2-(N-piperazinyl)acetamide, N-phenylurea, and benzimidazole. The acetohydrazide derivatives (compounds AG-07/7, AG-09/92, AG-09/96, AG-09/101, and AG-09/102) and N-phenylurea derivatives (AG-09/3, AG-09/4, AG-09/73 through AG-09/77, and AG-09/82) are all FPR2-specific, though the acetohydrazide compounds tend to have lower efficacies on average. The benzimidazole derivatives are either FPR1-specific (AG-09/1, AG-09/2, AG-09/13, AG-09/18, AG-09/19, and AG-09/21) or are agonists for both FPR1 and FPR2 (AG-09/16, AG-09/17, AG-09/20, and AG-09/22 through AG-09/24); no FPR2-specific benzimidazole derivatives have yet been identified. Pyridazines are another class of non-peptide, synthetic molecules which can have many different derivatives. Currently, only two compounds have been identified as potent mixed FPR1 and FPR2 agonists, with an EC50 of around 2 μM each. These two compounds, referred to as compounds 8b and 8c, have the pyridazin-3(2H)-one structure. They have R substitutions of SCH3 and OCH3, as well as R1 substitutions of I and SCH3, respectively. Both R and R1 substitutions are on substituted benzene rings [117]. See Table 6 for the list of synthetic/small-molecule non-peptide agonists.
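The receptor selectivities described in the text for the compounds of Kirpotina et al. [116] can be collected into a small lookup table. The Python sketch below is only a convenience restatement of the compound identifiers named above; the helper names (`expand`, `register`, `lookup`) are hypothetical, and the 2-(N-piperazinyl)acetamide derivatives are omitted because the text does not enumerate them.

```python
def expand(prefix, first, last):
    """Expand an inclusive range such as AG-09/73 through AG-09/77."""
    return [f"{prefix}/{n}" for n in range(first, last + 1)]


# compound ID -> (chemical class, receptor selectivity), as stated in the text
SELECTIVITY = {}


def register(compounds, chem_class, receptors):
    for compound in compounds:
        SELECTIVITY[compound] = (chem_class, receptors)


register(["AG-07/7", "AG-09/92", "AG-09/96", "AG-09/101", "AG-09/102"],
         "acetohydrazide", "FPR2")
register(["AG-09/3", "AG-09/4", *expand("AG-09", 73, 77), "AG-09/82"],
         "N-phenylurea", "FPR2")
register(["AG-09/1", "AG-09/2", "AG-09/13", "AG-09/18", "AG-09/19", "AG-09/21"],
         "benzimidazole", "FPR1")
register(["AG-09/16", "AG-09/17", "AG-09/20", *expand("AG-09", 22, 24)],
         "benzimidazole", "FPR1/FPR2")


def lookup(compound):
    """Return (class, selectivity) for a compound, or None if not listed."""
    return SELECTIVITY.get(compound)


print(lookup("AG-09/75"))
print(lookup("AG-09/23"))
```

Keying the table by compound identifier makes it easy to confirm, for example, that no benzimidazole entry is recorded as FPR2-specific.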


**Table 6.** Small-molecule compounds functioning as FPR (Fpr) agonists.



#### **6. Ligands from Non-Human Sources**

As the search for disease treatments continues, many investigators have turned to developing compounds isolated from various plants and animals for potential therapeutic uses. Some of these new compounds have been shown to activate human or murine FPRs. The first of these is a series of compounds isolated from the centipede *Scolopendra subspinipes mutilans*, which has classically been used in Oriental medicine and is now being studied for therapeutic potential [126]. New studies show that compounds Scolopendrasin III and V both cause human neutrophil migration via FPR1, while Scolopendrasin IX seems to work through FPR2 to promote neutrophil chemotaxis. The two former compounds have not yet been studied for effectiveness against particular pathologies, though Compound IX has traditionally been an effective treatment for rheumatoid arthritis in Oriental medicine. New evidence confirms this activity, citing the activation of FPR2 as the mechanism [9,127]. Temporins are another class of FPR agonists and consist of antimicrobial peptides isolated from *Rana temporaria* frogs. Temporin A and Rana-6 are two such peptides, and both activate FPR2 to promote leukocyte migration. There are also two distinct synthetic peptides, I4S10-C and I4G10-C, that are modeled after temporins and activate FPR2 [128].

Rubimetide is a peptide (Met-Arg-Trp) isolated from the digest of Rubisco in spinach. While it has been studied for some time, it has just recently been classified as an FPR2 agonist and has further demonstrated an ability to produce anxiolytic-like effects, thus alleviating some symptoms associated with anxiety [98]. The same investigators also isolated soymetide from the α' subunit of β-conglycinin from soybeans and then demonstrated its activity as an FPR1 agonist [129]. Furthermore, the antimicrobial peptides Piscidin-1 and -3, which were isolated from fish, have been shown to induce myeloid cell chemotaxis via both FPR1 and FPR2. As a testament to the harms of water pollution by metals, this study also demonstrated that conjugation of Cu<sup>2+</sup> with either of these compounds reduces the chemotactic activity of mammalian neutrophils [130]. See Table 7 for the list of peptides from other non-human sources.


**Table 7.** Non-human, non-microbe-derived FPR (Fpr) ligands.

#### **7. Concluding Remarks**

FPRs are a class of seven-transmembrane, G-protein-coupled receptors (GPCRs) that interact with a remarkably diverse range of ligands. As demonstrated, these ligands may originate from pathogens, the host, synthetic peptide or compound libraries, or even non-host multicellular organisms. With such diverse agonist binding capacity, it is not surprising that FPRs may be either detrimental or beneficial in different pathophysiological conditions. Though the majority of these agonists have been known for more than a decade, newer studies are finding novel roles for these ligands in treatments for conditions ranging from anxiety and mental health disorders to arthritis and wounds. The field of FPR agonist studies has demonstrated the therapeutic potential of these molecules. In addition to the vast number of agonists summarized here, there are also extensive lists of antagonistic ligands that may also provide protective mechanisms in various diseases [23,44]. Thus, further exploration of FPRs and their ligands as therapeutic targets would be highly beneficial for the treatment of diseases including cancer, septic shock, arthritis, and many other inflammatory pathologies.

**Funding:** This project has been funded in part by Federal funds from the National Cancer Institute (NCI), National Institutes of Health (NIH), under Contract No. HSN261200800001E, and is also supported in part by the Intramural Research Program of NCI, NIH.

**Acknowledgments:** The authors thank Cheri Rhoderick for secretarial assistance.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


6-methyl-2,4-disubstituted pyridazine-3(2H)-ones as potent N-formyl peptide receptor agonists. *Bioorg. Med. Chem.* **2012**, *20*, 3781–3792. [CrossRef] [PubMed]


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Humanized Mice as an Effective Evaluation System for Peptide Vaccines and Immune Checkpoint Inhibitors**

**Yoshie Kametani <sup>1,2,</sup>\*, Yusuke Ohno <sup>1</sup>, Shino Ohshima <sup>1</sup>, Banri Tsuda <sup>3</sup>, Atsushi Yasuda <sup>4</sup>, Toshiro Seki <sup>4</sup>, Ryoji Ito <sup>5</sup> and Yutaka Tokuda <sup>3</sup>**


Received: 5 November 2019; Accepted: 12 December 2019; Published: 16 December 2019

**Abstract:** Peptide vaccination was developed for the prevention and therapy of acute and chronic infectious diseases and cancer. However, vaccine development is challenging, because the patient immune system requires the appropriate human leukocyte antigen (HLA) recognition with the peptide. Moreover, antigens sometimes induce a low response, even if the peptide is presented by antigen-presenting cells and T cells recognize it. This is because the patient immunity is dampened or restricted by environmental factors. Even if the immune system responds appropriately, newly-developed immune checkpoint inhibitors (ICIs), which are used to increase the immune response against cancer, make the immune environment more complex. The ICIs may activate T cells, although the ratio of responsive patients is not high. However, the vaccine may induce some immune adverse effects in the presence of ICIs. Therefore, a system is needed to predict such risks. Humanized mouse systems possessing human immune cells have been developed to examine human immunity in vivo. One of the systems which uses transplanted human peripheral blood mononuclear cells (PBMCs) may become a new diagnosis strategy. Various humanized mouse systems are being developed and will become good tools for the prediction of antibody response and immune adverse effects.

**Keywords:** peptide vaccine; immune checkpoint inhibitor; humanized mouse; cancer antigen; immune suppression

#### **1. Introduction**

Peptide vaccines are widely accepted as a promising strategy to fight infectious disease and cancer. However, the efficacy of a peptide vaccine depends not only on the antigen presentation through antigen-presenting cells but also on the immune environment of each patient, since the immunity of patients with chronic infectious disease and/or cancers tends to be dampened. Therefore, to achieve a more personalized medicine, we need a more detailed diagnosis before treatment. We propose the use of the humanized mouse system, established by transplanting human peripheral blood mononuclear cells (PBMCs) from a patient into an immunodeficient mouse, for the evaluation of the response to peptide vaccines and other reagents which influence patient immunity. We also describe the immune condition artificially induced by immune checkpoint inhibitors (ICIs) [1] and the reagents against immune-related adverse events (irAEs), followed by the current state-of-the-art advances of humanized mouse systems and the issues to overcome. Moreover, we will discuss whether it is possible to evaluate the patient immunity by using second-generation humanized mice.

#### **2. Difficulties in the Development of Peptide Vaccines**

The design of peptide vaccines relies on the potential of peptides to bind to the major histocompatibility complex (MHC) in order to be presented by antigen-presenting cells (APCs), such as dendritic cells (DCs) and macrophages. However, the MHC binding affinity is not enough to predict the activation of immunity, because the immune condition differs among patients. Therefore, the decrease in immune competence should be evaluated when the vaccine is adopted for patients with cancer and/or affected by a chronic infection. The vaccines in question are not restricted to anticancer agents; they also include, for example, the influenza vaccine administered to cancer patients [2–4]. Moreover, if immune checkpoint inhibitors (ICIs) are used for the purpose of immune activation, the situation becomes more complex. We discuss these factors in detail below.

#### *2.1. Selection of the Adequate Peptide for Vaccination*

Vaccines are categorized as preventive or therapeutic based on their function and are further classified into virus, peptide, DNA, or DC vaccines, depending on the antigen source. Various types of antigens and adjuvants have been developed and evaluated for vaccination against infectious diseases. The design of the peptide antigen is important for inducing the most effective output with each type of vaccine, as each pathogen has a unique strategy for infection and proliferation. However, for long-lasting memory production, protein/peptide-based antigens are essential because the memory requires the activation of T cells through antigen-presenting cells, such as DC and macrophages. On the other hand, antigens need to activate B cells by crosslinking B-cell receptors (surface Igs). Therefore, the antigen epitope should be exposed to the hydrophilic surface by protruding into the aqueous solution and, thus, being recognized by B cells in vivo.

For the vaccine components to activate T cells, the antigens should at least contain a highly immunogenic peptide with more than 8, and up to 30, residues which can be further presented by the patient MHC (class I for cytotoxic T-cell activation and class II for antibody production). Moreover, as the peptide sequence mutates easily within the virus, it should be selected to maintain the peptide primary structure. The peptide presentation is predicted for HLA and mouse MHC by using available algorithms [5–7]. However, the prediction is incomplete because more new HLA types have been reported [8–10], and even if a peptide is successfully presented by mouse MHC in an experimental design, it does not imply that the same peptide will be presented on HLA. Therefore, larger peptide antigens are typically used in order to include as many epitopes as possible to be presented by major HLAs.
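As a rough illustration of the length constraint described above, the following Python sketch enumerates every contiguous sub-peptide of a protein within a chosen length window, which is the kind of candidate list one might pass to an HLA-binding prediction algorithm [5–7]. The function name and the example sequence are hypothetical, the default bounds of 9 and 30 residues are one reading of "more than 8, and up to 30", and no binding prediction is performed here.

```python
def candidate_peptides(protein, min_len=9, max_len=30):
    """Yield (start_index, peptide) for every window of min_len..max_len residues.

    Assumption: 'more than 8, and up to 30' residues is read as 9..30 inclusive.
    """
    longest = min(max_len, len(protein))
    for length in range(min_len, longest + 1):
        for start in range(len(protein) - length + 1):
            yield start, protein[start:start + length]


# Hypothetical 12-residue sequence, used only to show the enumeration.
example = "ACDEFGHIKLMN"
for start, peptide in candidate_peptides(example):
    print(start, peptide)
```

Because the number of windows grows quickly with protein length, a practical pipeline would filter these candidates with a presentation predictor for the HLA types of interest before any experimental testing.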

The evaluation of adjuvants is also very important. The induction of inflammation by the adjuvant is effective for the enhancement of the immune response. However, inflammation induction may pose a risk and result in adverse effects for patients. Therefore, self-adjuvanting techniques have been developed for clinical use [11–14]. Among them, the conjugation of molecules related to the ligands of Toll-like receptors (TLRs) to target peptides may provide a safe and effective vaccine adjuvant. DNA vaccines now in translational research use genes of TLR-related molecules.

#### *2.2. Antigens which Enable Activation of the Patient Immune System*

While vaccination is the most effective strategy to prevent acute infectious diseases caused by bacteria and viruses, it is not easy to develop effective vaccines against cancer and chronic infectious diseases. Similarly to the antigens present in pathogenic bacteria and viruses, patients with cancer present tumor-associated antigens (TAAs) with high antigenicity and immunogenicity. TAAs are classified into differentiation, tissue-specific, mutated, and overexpressed antigens [15]. The U.S. Food and Drug Administration (FDA) has already approved several cancer vaccines based on TAAs for clinical use [16,17]. Vaccines against hepatitis B virus (HBV) and human papilloma virus (HPV) are examples of TAA-based vaccines [18]. There are also unique classic vaccines like sipuleucel-T, the first therapeutic cancer vaccine approved by the FDA [19]. Moreover, many cancer vaccine candidates are currently under investigation in clinical trials, including nucleic acid-containing liposomes and nanoparticles (DNA vaccines) and gp100 peptide (peptide vaccines) [20–22].

On the other hand, especially for tumor-associated peptide vaccines, even if the antigen presentation is satisfied, it is difficult to activate the patient's immune system. In spite of the extensive development of vaccines which may induce an anticancer immune response in patients, this response may vary among patients, making the vaccine not always effective. Immune-reactive tumors are called hot tumors, whereas nonimmune-reactive tumors are referred to as cold tumors. Hot tumors are thought to have many more mutations than cold tumors, suggesting that hot tumors have many more TAAs [23]. Accordingly, the hot tumor, which is immune-reactive for the patient, may become the target of peptide vaccines, whereas, in the case of nonreactive cold tumors, the peptide vaccine might be ineffective. In hot tumors, some antigens are overexpressed on cancer cells. Human epidermal growth factor receptor 2 (HER2) is an example of a TAA molecule, as HER2 is overexpressed on the tumors of patients with breast cancer, and the specific antibody Herceptin is very effective for suppressing cancer progression. Due to the success of antibody reagents, many other human antitumor IgGs have been developed, and their mechanism of action has been investigated [24]. However, the antitumor effect does not last long enough, and the mechanism underlying this effect has not been fully elucidated.

Another problem in the development of cancer vaccines is the incomplete prediction resulting from the algorithms used. Our immune system rejects self-antigen-reactive clones, which may include cancer-specific clones. Therefore, many of the predicted peptides cannot induce the desired immune response, even though the peptide leads to an immune response in experimental animals. Even if the peptide functions as an antigen, cancer cells have heterogeneous mutations in the tumor mass, and, thus, the reactivity of each cancer cell is predicted to be diverse. Therefore, a complete rejection of the cancer cells within the tumor mass is difficult if only one TAA is selected as the peptide antigen.

#### *2.3. Immune Suppression in Patients Prevents the Effectiveness of Vaccines*

The most important challenge in the design of a vaccine is the immune suppression present in the patient. The levels of cytotoxic T-cell activation, antibody production, and productive inflammation differ among patients with cancer. Therefore, we cannot predict the patient's immune response, even if the peptide vaccine induces an immune response that is similar to the one produced by a viral infection in a healthy individual.

Therefore, although peptide vaccines have been extensively developed, the effect of the anticancer peptide vaccine is very limited, even if the peptide is presented by class I HLA on the patient DCs and the beneficial effect remains, as reviewed by Wong et al. [25]. One of the reasons for this limited effect is that cancer cells are originally "self", and the immune response is basically suppressed by clonal deletion or regulatory immune cell reactions, even though the peptide-reactive CD8 T cells are often detected in the patient PBMC. Even if mutations occur, most of them are limited to a very small region, and the peptides recognized as "non-self" might be very few or suppressed. This mechanism is present in cold tumors.

Meanwhile, an autoimmune disease might be induced by the suppression of peripheral tolerance. Neutrophil extracellular traps (NETs) play a role in the development of autoimmunity [26,27]. NETs are networks of extracellular fibers that are primarily composed of DNA from neutrophils, which suppress the movement of pathogens. Neutrophils release granule proteins, together with chromatin, and form an extracellular fibril matrix of NETs. The autoantigens involved in neutrophil granular proteins include very common proteins, such as actin and histones. The proteins vary with the stimulation, and they occasionally induce an autoimmune response. It is important to understand which conditions determine whether the immune system will or will not induce an autoimmune disease. Moreover, not only cancers, but also some pathogens, induce tolerance. Indeed, immature DCs, which induce only an MHC-TCR signal, may induce anergy in self-reactive and non-self-reactive T cells [28].

#### **3. Immune Checkpoint Inhibitors and Reagents for Side-Effect Regulation**

Recently, adaptive immune-resistant tumor cells which express the programmed-death-L1 (PD-L1) antigen were reported in melanomas by Abiko et al. [29] and Taube et al. [30]. According to their reports, PD-L1 is largely induced on the local tumor cells by tumor-infiltrating lymphocyte (TIL)-derived IFN-γ, because IFN-γ is the most potent inducer of PD-L1 among inflammatory cytokines. Upregulation of PD-L1 by IFN-γ has been extensively described in various cell types [31–37]. Similarly, TNF-α, another pro-inflammatory cytokine, also upregulates PD-L1 expression via the TNF-α-NF-κB pathway [38–40]. TNF-α is reported to act synergistically with IFN-γ to induce PD-L1 expression at both mRNA and protein levels. IFN-γ enhances the resistance to the adaptive immune response by PD-L1 induction in hepatocellular carcinoma cells which upregulate the expression of IFN-γ receptors [41]. PD-L1 is expressed not only in all hematopoietic cells but also in many non-hematopoietic cell types, such as endothelial and epithelial cells [42,43]. In contrast, PD-L2 expression is more restricted to professional antigen-presenting cells, such as DCs, B cells, and monocytes/macrophages. Besides PD-1, there are other known interacting partners for PD-L1 and PD-L2. PD-L1 also binds to CD80, whereas PD-L2 uses repulsive guidance molecule (RGM) domain family member B (RGMB) as an alternative binding partner. Both types of interaction also inhibit immune responses [44,45].

#### *3.1. Patients with Cancer*

Recently, the anticancer effect of various immune checkpoint antibodies was elucidated [46]. The "immune checkpoint antibody" induces the blockage of continuous T-cell activation in the periphery. PD-1 antigen is expressed on long-lived activated T cells, exhausted T cells, and follicular helper T cells (Tfh) [47,48]. Normally, PD-L1 is expressed on antigen-presenting cells (APCs) and germinal center B cells [49,50]. Apoptosis is induced when PD-1-expressing T cells encounter PD-L1-expressing APCs [49]. When the PD-1/PD-L1 interaction is inhibited by the anti-PD-1 antibody, T cells survive, and the anticancer effect is prolonged. Immune checkpoint molecules such as CTLA-4, PD-1, TIM-3, and LAG-3 have been reported, and antibodies against these molecules are being evaluated as anticancer products [51,52]. The effect is remarkable, but the response is still limited to a fraction of patients with cancer. The effect is ordinarily not long-lasting, and the combination of these inhibitors with other anticancer drugs is under investigation.

Moreover, antibodies are so expensive that, before using them as therapeutic agents, a strategy is needed to distinguish patients who are responsive to the treatment from those who are not. Many biomarkers have been reported to predict the efficacy of the treatment. However, the heterogeneity of tumor masses and the variety of antibodies available make it difficult to find such predictive biomarkers, and even PD-L1 expression might not be a promising marker. Collectively, many studies have suggested that PD-L1 expression on melanoma cells can represent a biomarker to test for the efficacy of anti-PD1 and related antibodies, such as nivolumab, ipilimumab, and pembrolizumab [53–55], and other immune checkpoint inhibitors; however, PD-L1 expression is not always an effective marker for patients with cancer in other clinical trials [56,57]. For example, PD-L1 expression on melanoma cells in pretreatment tumor biopsy samples is reported to correlate with response rate, progression-free survival, and overall survival in patients with advanced melanoma treated with anti-PD1 antibodies [55], but these antibodies are also effective for PD-L1-negative patients [57].

While the benefits of assessing PD-L1 expression on melanoma cells to predict the clinical outcomes of immune checkpoint inhibitor (ICI) treatment have been suggested, as above, there are still no common criteria of diagnosis. This fact limits the clinical usefulness of the diagnosis of PD-L1 expression, because the low sensitivity of immunohistochemical (IHC) assays using different antibody clones makes it difficult to establish staining platforms and scoring systems [54,55,57–59]. To avoid misprediction by IHC staining, Conroy et al. assessed the expression of PD-L1 using next-generation RNA sequencing, but the sensitivity of their system resembles that of IHC assay systems and is, in addition, more expensive [58]. Additional assays or completely different assay systems will be needed in the future to diagnose PD-L1 expression in patient cancer tissues, for the prediction of clinical outcomes for the ICI treatment of melanoma [60].

#### *3.2. Patients with Infectious Diseases*

Viral infections do not always enhance PD-L1 expression, because similar PD-L1 levels are detected in individuals not infected with viruses [61–64]. Increased PD-L1 levels are related to specific viruses, such as the following: Epstein–Barr virus (EBV) [65–68], hepatitis B virus (HBV) [69–71], hepatitis C virus (HCV) [72–75], human immunodeficiency virus (HIV) [63,76–79], human papilloma virus (HPV) [68,80–83], Merkel cell polyomavirus (MCPyV) [84], bovine leukemia virus (BLV) [85], and Kaposi sarcoma-associated herpes virus (KSHV) [86]. The pathobiological mechanisms by which viruses trigger the expression of PD-L1 have been elucidated. Pathogen-associated molecular patterns (PAMPs) such as lipopolysaccharides (LPS), double-stranded RNA, and non-methylated CpG, from viruses, bacteria, and fungi, activate Toll-like receptors (TLRs) to induce the immune response and protect the host against the infection. Therefore, the effect of PD-1/PD-L1 blockade by ICI might not be limited to blocking the cancer-T-cell interaction. Other hematopoietic lineage cells expressing PD-1 and/or PD-L1 might also be affected. For example, a fraction of plasmablasts and regulatory B (Breg) cells also express PD-L1 [87,88]. Therefore, the blockade of this axis may affect humoral immunity or Breg cells. However, the antigen-specific reaction in such systemic immunity is difficult to analyze in vivo.

#### *3.3. Steroid Hormones and ICI Side Effects*

Glucocorticoids are a class of steroid hormones that act as powerful immunosuppressants on the systemic immune response. Conditions such as pregnancy and chronic inflammation may induce glucocorticoid secretion. Glucocorticoids [89], which are secreted in response to chronic inflammation, are also widely used as anti-inflammatory drugs. While they induce various signals related to cytokine and Fc receptors that modify metabolism and immune responses, it was recently reported that glucocorticoids impair upstream B-cell-receptor and Toll-like-receptor 7 signaling, reduce transcriptional output from the immunoglobulin loci, and promote significant upregulation of genes encoding the immunomodulatory cytokine IL-10 and the terminal-differentiation factor BLIMP-1 [90]. Expression of the κ light chain and the two variable regions is especially suppressed. If patients affected with cancer or severe infectious diseases increase their glucocorticoid levels in order to overcome the disease-induced inflammation, or if they are treated with glucocorticoids to regulate anticancer drug-induced side effects, the anticancer Ig expression might be suppressed. If the inflammatory, glucocorticoid-abundant condition continues, the potential for antibody production in the patient may be dampened. Therefore, if the PBMC of patients are examined for the antibody-production response, we may be able to predict whether the patient is exposed to such steroid-based immune suppression. Glucocorticoids have also been reported to enhance metastasis in breast cancer [91]; therefore, their effect on patients with cancer needs to be examined in more detail.

On the other hand, it has been reported that ICI treatments occasionally induce a typical side effect related to pituitary dysfunction. Notably, hypophysitis, a previously very rare disease, has emerged as a distinctive side effect of ipilimumab and occasionally of nivolumab [92]. These side effects are not limited only to the pituitary; they also affect the thyroid, adrenal glands, and other downstream-target organs [93].

#### **4. Humanized Mouse Models for the Evaluation of the Human Immune Environment**

As we mentioned above, it is difficult to predict the development of protective immunity by vaccination because the immune condition differs in each patient, and the appropriate ICIs and the induced immune-related adverse events (irAEs) may not be predictable. In order to determine a protocol reflecting the immune condition of each patient, the so-called personalized medicine, a humanized mouse system reconstituted with the patient's immunity may be useful [94]. Immunization with vaccines may reveal not only the effect of a specific vaccine, but it may also provide information regarding the patient's immune response to mimic the anticancer/pathogen response. The current status of the humanized mouse system involving next-generation humanized mice and its limitations is shown in Figure 1 (cellular immunity) and discussed below [95,96].

**Figure 1.** Three strategies for the reconstitution of human immunity in the immunodeficient mouse. The transplanted tissues are HSC, lymphoid tissues or the fragments of newborn, and PBMCs. Many kinds of antigens and pathogens were used for the analysis.

#### *4.1. Humanized Mice for Reconstitution of the Human Immune System with Hematopoietic Cells*

The humanized mouse system was originally developed to evaluate the multipotency of human HSCs or progenitors. Severely immunodeficient mouse strains, as well as the transplantation techniques, have recently been developed [97–100]. After the discovery of the nonobese diabetic severe combined immunodeficient mouse (NOD-scid) model and its derivatives, transplantation of human hematopoietic stem cells (HSCs) into these mice led to the development of human lymphocytes and myeloid cells, which are localized in the primary and secondary lymphoid tissues of the mouse [101–103]. These mouse models have been used to analyze the differentiation of human hematopoietic and leukemic stem cells [104]. On the other hand, because of the success of humanized monoclonal antibody reagents such as trastuzumab and rituximab, completely human-type antibody production has also been attempted, using these mouse models transplanted with various types of human hematopoietic cells [105].

NOD/Shi-scid-IL2Rγnull (NOG), developed at the Central Institute for Experimental Animals, and NOD scid gamma (NSG), developed at the Jackson Laboratory, are two representatives of severely immunodeficient mouse strains. Both mouse strains have a deficiency in IL-2rgc [97,106–108]. NOG mice possess a truncated IL-2rgc, and NSG mice have a complete deletion of the gene coding for IL-2rgc; the efficiency of the engraftment and the differentiation efficiency are comparable in the two strains. Both of them enabled the development of human T and B cells from human HSCs in a xenogeneic environment. However, most of the human B cells differentiated in the mice expressed CD5, a marker of B1 cells, and specific IgG antibody was not produced [109–113] (Table 1). We reconstructed human immunity in NOG mice transplanted with HSC and immunized with CH401MAP, a specific HER2 peptide antigen for patients with breast cancer, and keyhole limpet hemocyanin (KLH), or toxic shock syndrome toxin-1 (TSST-1), with Freund's complete adjuvant and measured the specific antibody titer by ELISA. As a result, although antigen-specific IgM and nonspecific IgG were detected in the sera, antigen-specific IgG was not detected in the mice (Table 1) [114–116]. These mice did not develop a germinal center, which has a structure composed of T cells, B cells, and follicular DCs and plays a crucial role in highly specific class-switched IgG antibody production. The results indicated that human T cells and B cells developed in the mouse environment could not induce cognate interaction, because the T cells are selected for mouse MHC in the thymus.


**Table 1.** Humanized mice with antigen-specific antibody production.

Representative immune-humanized mouse systems which induced antibody production are shown. The data are based on PubMed, published from 1988 to 2019.

After the first trials with the NOG and NSG mouse models, animals with mouse MHC knockout and transgenic HLA were developed to induce cognate interaction of T cells and B cells. Among them, HLA class I transgenic mice evoked an antigen-specific cytotoxic T-cell response against HSV virion protein peptide [128] or WT1 peptide [129]. The success of the reconstitution of the human cellular immune response was followed by an adoptive transfer therapy model using the humanized-mouse system [130]. Consequently, the established patient-derived xenograft (PDX) system, which transplants a patient's cancer tissues (a minimal standard was reported by Meehan et al. [131]), combined with a patient's T cells, is widely accepted. The details have been extensively reviewed by other researchers [132–134].

On the other hand, the response of HLA class II transgenic mice did not completely mimic human humoral immunity [119,120,125]. Moreover, mice need to be transplanted with human HSCs bearing the same HLA, which restricts the samples that can be examined. Among them, Ashizawa et al. reported that class I and class II MHC KO NOG mice (NOG dKO) transplanted with human PBMC and tumor cell lines showed higher anticancer effects after PD-1 antibody treatment [135]. In these mouse strains, transplanted tumor cells and immune cells can be engrafted, and the anticancer effect of human immune cells can be observed (reviewed by Chen et al. [95]). The mouse system had the advantage that the restriction of HLA type could be avoided by using PBMC, which contain the same patient's T cells and antigen-presenting cells. However, they did not detect anticancer antibody production in this study.

Currently, various transgenic mouse strains expressing human cytokines and surface antigens, along with more severely immunodeficient mouse strains, are being developed for transplantation of human hematopoietic cells (HSC or PBMC). The newly established mouse systems include models of myeloid cell development, cancer immunotherapy, allergy, and graft-versus-host disease (GVHD).

Another humanized mouse model, called BLT mice, has been reported. In this mouse model, immunodeficient mice are co-transplanted with human fetal liver and thymus tissues, along with autologous CD34+ HSCs. This mouse system is a modification of the SCID-Hu mice developed by McCune [113,117,123]. In these mice, antigen-specific antibody production was partially achieved, and experiments on infection with bacteria or viruses were conducted [118]. Severely immunodeficient NSG mice are used to establish NSG–BLT mice [136]. A modified NSG mouse strain, in which the human *SCF*, *GM-CSF*, and *IL-3* genes were transduced, was used to establish an improved BLT mouse strain. Based on the NSG mouse strain, human HSCs, fetal liver, and fetal thymus were transplanted, and mice were inoculated with dengue and/or Zika virus. As a result, these mice induced a higher immune response than that of conventional NSG mice, although graft-versus-host disease (GVHD) could not be avoided [124,126,127]. However, because of a serious ethical problem, Japanese researchers are unable to establish the BLT mouse system. The BLT model system succeeded in the induction of the cytotoxic immune response with no mature humoral immunity, possibly because the cytotoxicity is too high to maintain antibody production (discussed in [94]).

Collectively, many of the strains support the differentiation of various hematopoietic cell lineages from human HSCs. Moreover, PBMC engrafts in the mice and can reconstitute human cellular immunity. However, human humoral immune response in a mouse model still needs further improvement: it is impossible, so far, to reconstruct the immune condition involving humoral immunity of various patients.

#### *4.2. Humanized Mouse System to Evaluate Antigen-Specific Antibody Production*

It is difficult to completely develop humoral immunity in humanized mice for the reasons described above. While T cell–B cell interaction needs cognate interaction, humans have a large variety of HLA types, and it is difficult to cover all the HLA types present in patients' blood. Immunodeficient mice transplanted with PBMCs are promising tools to evaluate human immune responses to vaccines, compared to the HSC-transplanting mouse system. However, these mice usually develop severe GVHD [137]. With GVHD, mice develop a large number of activated T cells, while B cells decrease in parallel, and there is no humoral immune response. Therefore, it is difficult to evaluate antigen-specific IgG production after antigen immunization in those mice. To evaluate antigen-specific IgG responses in PBMC-transplanted immunodeficient mice, we developed a novel NOD/Shi-scid-IL2rgnull (NOG) mouse strain that systemically expresses the human IL-4 gene (NOG-hIL-4-Tg) [116]. After human PBMC transplantation, GVHD symptoms were significantly suppressed in the Tg NOG, as compared to conventional NOG mice. In the kinetic analyses of human leukocytes, long-term engraftment of human T cells was observed in the peripheral blood of NOG-hIL-4-Tg, and CD4+ T cells proliferated dominantly over CD8+ T cells. Furthermore, these CD4+ T cells produced large amounts of IL-4 but suppressed IFN-γ expression, resulting in long-term suppression of GVHD. Most of the human B cells detected in the transplanted mice showed a plasmablast phenotype. Vaccination with HER2 multiple antigen peptide (CH401MAP) or keyhole limpet hemocyanin (KLH) successfully induced antigen-specific IgG production in PBMC-transplanted NOG-hIL-4-Tg. The HLA haplotype of donor PBMC might not be relevant to the ability to secrete antibodies after immunization. We examined why NOG-hIL-4-Tg mice retain B cells and succeed in specific antibody production, and we found that the engrafted human lymphocytes decreased the expression of the glucocorticoid receptor, which dampens humoral immunity [138].

This evidence suggests that the PBMC-transplanted NOG-hIL-4-Tg mouse system is an effective tool to evaluate the production of antigen-specific IgG antibodies, following vaccination in individual cancer patients [116]. The mouse system can be used for the evaluation of the effect of ICIs on antibody production in the presence of human PBMCs, as well.

Of course, the vaccination is not limited to cancer vaccines. As plasmablasts are efficiently developed, the evaluation of vaccines against highly deleterious pathogens, such as Ebola virus, may become possible. Moreover, the donors recovered from such serious infectious disease may keep their memory B cells against the pathogen. Therefore, the transplantation of the PBMCs may develop plasma cells that secrete effective antipathogen antibodies. If we establish the technology for monoclonal antibody preparation, we may obtain the monoclonal antibody reagents for the treatment of such deleterious infectious diseases.

The humanized mouse systems discussed are summarized in Table 1.

#### **5. Future Perspectives**

Because the efficacy of the peptide vaccine is influenced by the immune-cell environment and the patient's body fluid content, we need to evaluate vaccines by constructing patient-mimicking conditions. If we can establish patient-PBMC-based check systems using the humanized mouse model for vaccination and additional reagents, we may check the vaccination efficiency, ICI, and irAEs at the same time. If those goals are achieved, they may enable a promising personalized medicine, such as in the case of the use of the mixed lymphocyte reaction for blood-type examination before transplantation. Therefore, it is urgent to develop humanized mice which reconstitute not only human immune cells but also the environment of the actual patient. By using the PBMC-based humanized mouse system, various vaccines can be evaluated for their efficacy. We need to improve the humanized mouse system to fine-tune the peptide design for vaccine development.

**Author Contributions:** Conceptualization, Y.K., T.S., Y.T. and R.I.; analysis, Y.O., S.O., B.T., A.Y. and Y.K. drafted the work.

**Funding:** The NOG research was supported by the Japan Society for the Promotion of Science by a Grant-in-Aid for Scientific Research (ITO) (S) [grant number 22220007] to MI; a Grant-in-Aid for Scientific Research (Kametani) (B) [grant number 17H03571] to YK; a Tokai University Grant-in-Aid to YK (2013–2014); and the MEXT-Supported Program for the Strategic Research Foundation at Private Universities (2012–2016).

**Acknowledgments:** We thank Yumiko Nakagawa for her excellent animal care skills. We thank the members of the Teaching and Research Support Center in the Tokai University School of Medicine for their technical skills.

**Conflicts of Interest:** We have no conflicts of interest.

#### **Abbreviations**


#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Detection of Antigen-Specific T Cells Using In Situ MHC Tetramer Staining**

**Hadia M. Abdelaal <sup>1,2</sup>, Emily K. Cartwright <sup>1</sup> and Pamela J. Skinner <sup>1,3,</sup>\***


Received: 1 October 2019; Accepted: 16 October 2019; Published: 18 October 2019

**Abstract:** The development of in situ major histocompatibility complex (MHC) tetramer (IST) staining to detect antigen (Ag)-specific T cells in tissues has radically revolutionized our knowledge of the local cellular immune response to viral and bacterial infections, cancers, and autoimmunity. IST combined with immunohistochemistry (IHC) enables determination of the location, abundance, and phenotype of T cells, as well as the characterization of Ag-specific T cells in a 3-dimensional space with respect to neighboring cells and specific tissue locations. In this review, we discuss the history of the development of IST combined with IHC. We describe various methods used for IST staining, including direct and indirect IST and IST performed on fresh, lightly fixed, frozen, and fresh then frozen tissue. We also describe current applications for IST in viral and bacterial infections, cancer, and autoimmunity. IST combined with IHC provides a valuable tool for studying and tracking the Ag-specific T cell immune response in tissues.

**Keywords:** T cells; In situ tetramer staining; MHC tetramer; immune response; antigen-specific; confocal microscopy; fresh tissue

#### **1. Introduction**

T cells play a pivotal role in the adaptive immune response. They perform a wide range of immune functions, including, but not limited to, providing help for B cells, protecting against intracellular and extracellular pathogens, detecting and killing cancer cells, and preventing autoimmunity [1]. T cells recognize the antigen (Ag) via a T cell receptor-cluster of differentiation 3 (TCR-CD3) complex in the context of peptide-major histocompatibility complex (p-MHC) on the surface of antigen-presenting cells (APCs). Cluster of differentiation 4 (CD4)<sup>+</sup> T cells recognize antigens processed by APCs placed into a groove of MHC class II molecules (MHCII), whereas cluster of differentiation 8 (CD8)<sup>+</sup> T cells recognize antigens presented by MHC class I molecules (MHCI). Regardless of the class of MHC, TCR:p-MHC interaction is required for initiating the T cell signaling cascade, leading to T cell activation [2].

The development of flow cytometric analysis of Ag-specific CD8<sup>+</sup> and CD4<sup>+</sup> T cells using fluorochrome-conjugated p-MHCI and p-MHCII tetramer staining, respectively, has dramatically increased our understanding of the cellular immune response [3,4]. Using flow cytometry, we are able to determine the quantity, function, and phenotype of Ag-specific T cells [5–7] and identify associations between the human leukocyte antigen (HLA) haplotype and disease progression [6–8]. Despite these important contributions, a major limitation of flow cytometry is the inability to visualize the localization of Ag-specific T cells, both with regard to their interactions with other cells and their distribution within the tissue compartment. In addition, the dissociation of tissues into a single cell suspension for flow cytometry tends to underestimate the quantity of Ag-specific T cells within non-lymphoid tissues, such as the female reproductive tract (FRT), lung, and liver [9]. Thus, while flow cytometric analysis of Ag-specific T cells is extremely valuable, it fails to determine the spatial relationships between Ag-specific T cells and target cells and underestimates their total numbers in tissue, both of which are crucial for a complete understanding of the cellular adaptive immune response.

We developed a method for the in situ detection of Ag-specific CD8<sup>+</sup> T cells using MHCI tetramers. Using in situ MHCI tetramer (IST) staining combined with IHC, we were able to directly visualize and quantify Ag-specific CD8<sup>+</sup> T cells and their specific location within the tissue compartment [10]. We used fresh, unfixed, lightly fixed, and frozen spleens from TCR transgenic mice. Ag-specific CD8<sup>+</sup> T cells were readily detected in the spleens from the transgenic mice, and we found that fresh tissues by far produced the best quality staining [10]. At the same time as we were performing these studies, Haanen et al. developed and used similar IST staining methods combined with IHC to detect virus-specific CD8<sup>+</sup> T cells in TCR-transgenic and wild-type virus-infected mice, as well as to detect endogenous CD8<sup>+</sup> T cells directed against epitope tagged tumor cells in mice [11].

Since this time, we and others have also developed IST methods that use p-MHCII multimers to detect Ag-specific CD4<sup>+</sup> cells in situ [12–16]. With MHC class I and class II IST technologies, we are able to determine the spatial and temporal location and abundance of Ag-specific T cell responses in tissues. By combining IST with IHC, we are able to stain Ag-specific T cells, as well as cellular markers. Additional cellular markers allow phenotypic characterization of Ag-specific T cells and the surrounding cells in the tissue, which can include target cells. We have recently produced a video demonstrating IST staining [17], and these IST staining methods have previously been reviewed [2,18–20]. This review article builds on previous reviews, incorporates new methodologies, and describes more recently developed applications.

#### **2. In Situ Tetramer Staining**

Tetramers designed for IST are the same as those used in conventional flow cytometry. They both consist of four MHC monomers loaded with a specific peptide to interact with the T cell specific to that peptide [3].

#### **3. Direct vs. Indirect IST**

The two common methods used to detect Ag-specific T cells in situ are direct and indirect IST. Direct IST requires the use of an MHCI or MHCII tetramer conjugated directly to a bright fluorophore, like allophycocyanin (APC) or phycoerythrin (PE), to directly label Ag-specific T cells [11,21,22]. Direct staining can also be done by using MHC-dextran multimers (dextramers) [16,23]. These multimers have more p-MHC complexes and more fluorochromes, allowing for a brighter signal than standard tetramers with only one fluorophore. In addition, Tjernlund et al. used Qdot 655 multimers to directly detect SIV-specific CD8<sup>+</sup> T cells [7,24].

Indirect IST uses antibody staining directed against the fluorophore on the tetramer to amplify the signal [10,14,17,25–36]. For example, as described in Figure 1, the four biotinylated monomers are bound to an FITC-labeled ExtrAvidin molecule (a fluorescently labeled avidin). Conjugation to this FITC-avidin molecule allows amplification of the tetramer signal using an anti-FITC antibody. In this case, tissue is labeled with FITC-conjugated MHCI tetramers, followed by incubation with rabbit α-FITC antibodies for signal amplification. Then, a secondary antibody, such as Cy3-labeled α-rabbit IgG, further amplifies the signal. Figure 1 also shows concurrent staining with CD3 antibodies to label T cells (in blue) and CD20 antibodies to label B cells (in green). Figure 2 shows a representative image of a spleen tissue section stained indirectly with FITC-conjugated MHCI tetramers to detect virus-specific CD8<sup>+</sup> T cells (in red) and counterstained with antibodies against CD3 to label T cells (in blue) and CD20 to detect B cells (in green).

**Figure 1.** In situ major histocompatibility complex (MHC) class I (MHCI) tetramer staining combined with immunohistochemistry (IHC) to detect virus-specific CD8<sup>+</sup> T cells. Schematic diagram of in situ MHC tetramer (IST) combined with IHC to detect virus-specific CD8<sup>+</sup> T cells in fresh, unfixed tissue sections. An MHCI tetramer consists of four biotinylated MHC-class I monomers loaded with a viral peptide (or another antigenic peptide) bound to a fluorescently labeled avidin molecule. After primary incubation with MHCI tetramers, sections are fixed and then anti-FITC antibodies are used to amplify the tetramer signal. This signal is then further amplified using Cy3-tagged anti-Rabbit IgG antibodies. Sections can be counterstained with CD3 antibodies to label T cells (blue), and CD20 antibodies to label B cells (green).

**Figure 2.** IST detection of virus-specific CD8<sup>+</sup> T cells. IST combined with IHC in spleen sections from an SIV-infected rhesus macaque. A fresh, unfixed spleen section was stained with *Mamu-A\*01* tetramers loaded with SIV Gag/CM9 peptides to detect SIV-specific CD8<sup>+</sup> T cells (red), and counterstained with CD3 antibodies to label T cells (blue) and CD20 antibodies to label B cells (green) and delineate B cell follicles. Confocal images were collected using a 20× objective and 3 μm z-steps. (**A**) shows a montage of several projected confocal z-series fields. The scale bar = 100 μm. (**B**) shows an enlargement of the selected area in panel (**A**), which is a confocal Z-scan showing the distribution of tetramer<sup>+</sup> T cells within the spleen. The scale bar = 100 μm. (**C**–**F**) are enlargements of the selected area in panel (**B**) and show that an SIV-specific CD8<sup>+</sup> T cell is tetramer<sup>+</sup> (**C**,**D**), CD3<sup>+</sup> (**E**), and CD20<sup>−</sup> (**F**); scale bars = 10 μm. Arrowheads point to a virus-specific CD8<sup>+</sup> T cell.

MHC tetramers conjugated to PE and APC can similarly be used for indirect staining [21,22,37–41]. In addition, antibodies directed against streptavidin can be used. For example, Vries et al. used indirect MHCI IST to detect melanoma-specific CD8<sup>+</sup> T cells following dendritic cell vaccination of melanoma patients, where they used a rabbit anti-streptavidin antibody that recognizes MHCI tetramer-associated streptavidin molecules. They amplified the signal from the anti-streptavidin antibodies using goat anti-rabbit Alexa594 [42]. Another application of indirect tetramer staining involves the use of a horseradish peroxidase (HRP)-conjugated tetramer. Instead of a fluorochrome, Yang et al. used tetramers conjugated to HRP–streptavidin and amplified the signal with the addition of biotin-conjugated tyramide [21,43].

Both methods have their advantages and drawbacks. Direct staining is a simpler procedure, can result in lower background staining, and provides more options to co-label other proteins since no secondary antibody is involved in labeling TCRs. However, direct staining provides a weaker signal intensity and is relatively more expensive because it requires as much as 40 times the tetramer of the indirect staining method [18]. In contrast, indirect labeling is a multi-step procedure that is more time consuming. Indirect staining, however, yields a more intense signal, resulting in a much higher signal to noise ratio and is relatively less expensive because it requires smaller amounts of the tetramer reagents.
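The reagent trade-off described above can be made concrete with a small calculation. The following is a minimal sketch, not a protocol: the per-section tetramer amount and the number of sections are hypothetical values chosen only for illustration, and the 40-fold factor is treated as an upper bound, following the "as much as 40 times" wording above.

```python
# Upper-bound fold difference in tetramer use (direct vs. indirect), per the text.
DIRECT_FOLD_MAX = 40


def direct_tetramer_upper_bound(indirect_amount_ug: float, n_sections: int) -> float:
    """Upper bound on total tetramer (in ug) needed for direct IST,
    given the amount used per section for indirect IST."""
    if indirect_amount_ug < 0 or n_sections < 0:
        raise ValueError("amounts and section counts must be non-negative")
    return indirect_amount_ug * DIRECT_FOLD_MAX * n_sections


# Hypothetical experiment: 12 sections, 0.5 ug tetramer per section for indirect IST.
indirect_per_section_ug = 0.5
n_sections = 12

indirect_total_ug = indirect_per_section_ug * n_sections
direct_total_max_ug = direct_tetramer_upper_bound(indirect_per_section_ug, n_sections)

print(f"indirect IST: {indirect_total_ug:.1f} ug tetramer")
print(f"direct IST:   up to {direct_total_max_ug:.1f} ug tetramer")
```

Actual reagent requirements depend on the tetramer, tissue, and staining protocol, so the numbers here only show how quickly the difference scales with the number of sections.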

#### **4. IST Staining on Fresh and Frozen Tissue**

IST staining can be done on fresh tissue sections, fresh then frozen tissue, or frozen tissue sections. In situ tetramer staining is ideally performed using unfixed, fresh tissue sections to maintain the structure and mobility of TCRs to interact with p-MHC tetramers [10,11]. To generate fresh 200 μm tissue sections, either a Vibratome or Compresstome can be used. However, a Compresstome is much more efficient in generating sections and accommodates larger section sizes [25]. While fresh tissue sections are ideal, there are some circumstances where fresh samples are not feasible. For example, some studies require that samples be shipped overnight. Other studies have limited tissue sampling or size availability, or their tissue was already frozen and archived. To determine whether IST is feasible under these conditions, we performed IST on tissue samples that were stored at 4 °C overnight in PBS, lightly pre-fixed, or frozen [10]. We found that there was no difference in the quality of the staining between spleen sections stained directly after dissection and spleen sections that were stored overnight in PBS at 4 °C. Moreover, we found that IST also worked on lightly fixed spleen tissue from TCR transgenic mice (defined as 2% formaldehyde or 50% methanol and 50% acetone). While IST worked on lightly fixed tissues, it yielded a higher background and less intense signal than the fresh, unfixed tissue. Additionally, IST worked on 10 μm-thick frozen sections but also resulted in weaker signal intensity compared to that from fresh tissue sections [10]. Vyth-Dreese et al. compared direct tetramer staining on fresh viable tissue sections versus cryopreserved tissue sections and were able to detect Ag-specific T cells only in viable tissue sections. However, they were able to detect Ag-specific T cells in fresh skin tissue sections that were pre-stained with tetramers and then cryopreserved [22]. Similarly, others have now successfully performed indirect IST on fresh tissue pre-incubated with tetramers, and then fixed and snap-frozen. Later, frozen sectioning was done followed by IHC and tetramer amplification [38,44].

For staining frozen sections, IST has been described on unfixed and fixed tissue samples. Fixation was done before tetramer incubation, after tetramer incubation, or both before and after tetramer incubation. We performed IST on unfixed spleen tissue stored in OCT freezing medium, and fixation was done post-tetramer incubation [10]. Similarly, Oerke et al. and Tully et al. performed indirect IST using PE tetramers on frozen sections, and fixation was done after tetramer incubation [40,41]. Tjernlund et al. used Qdot 655 multimers to directly stain frozen sections, where fixation was done post-tetramer incubation [7,24]. In addition, Yuhong et al. used indirect IST to detect *Mycobacterium tuberculosis* (*M. tb*)-specific CD4<sup>+</sup> T cells in lymph node and lung from untreated *M. tb* patients. For this study, IST was done on frozen sections that were first fixed in 4% PFA and then incubated overnight with tetramer [39]. Similarly, Vries et al. lightly fixed frozen tissue sections before starting IST and fixed them again after incubation with tetramers [42].

In summary, performing IST on fresh (not frozen) unfixed tissue has several advantages compared to pre-fixed or thin frozen sections. IST with fresh tissue sections results in the highest staining intensity over background fluorescence. In addition, the use of fresh 200 μm-thick sections provides more information about the tissue because it allows the examination of 20 times more tissue than a thin 10 μm-thick frozen section. When trying to detect a rare population of Ag-specific T cells, the more tissue examined, the greater the chance of detecting rare cells. Moreover, the thick fresh sections can be examined using confocal microscopy to provide a 3-D view of the location and the interaction of Ag-specific CD8<sup>+</sup> T cells with other cells and tissue structures. On the other hand, frozen sections offer some great advantages in that they enable the detection of Ag-specific T cells in archived samples. Additionally, with frozen sections, tissue samples can be stored and processed when needed, which makes it easier to answer future questions that might arise.
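The 20-fold figure follows from the ratio of section thicknesses, assuming sections of equal area. The sketch below makes that arithmetic explicit and, under a simple Poisson assumption that is not part of the cited studies, illustrates why examining more tissue raises the chance of encountering at least one rare cell. The cell density is a hypothetical value used only for illustration.

```python
import math


def volume_ratio(thick_um: float, thin_um: float) -> float:
    """Ratio of tissue volume per section, assuming equal section area."""
    if thin_um <= 0:
        raise ValueError("section thickness must be positive")
    return thick_um / thin_um


def prob_at_least_one(expected_cells: float) -> float:
    """P(at least one cell in the section), assuming Poisson-distributed counts."""
    return 1.0 - math.exp(-expected_cells)


# 200 um fresh section vs. 10 um frozen section (thicknesses from the text).
ratio = volume_ratio(200.0, 10.0)

# Hypothetical density: on average 0.05 Ag-specific cells per 10 um section.
expected_thin = 0.05
expected_thick = expected_thin * ratio

p_thin = prob_at_least_one(expected_thin)
p_thick = prob_at_least_one(expected_thick)

print(f"volume ratio: {ratio:.0f}x")
print(f"P(>=1 cell) in thin section:  {p_thin:.3f}")
print(f"P(>=1 cell) in thick section: {p_thick:.3f}")
```

Real tissue is not uniform (Ag-specific cells cluster in particular compartments), so the Poisson model should be read as a rough intuition rather than a quantitative prediction.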

#### **5. Specificity and Sensitivity of IST**

A critical and remarkable property of the cellular adaptive immune response is specificity, where selective activation and expansion of a very small population of Ag-specific T cells is required. Therefore, the specificity and sensitivity of IST are key to its success in detecting such small fractions of Ag-specific T cells [4]. Because high background autofluorescence is inherent when imaging whole tissues, ensuring the proper negative controls for IST staining is crucial [10]. There are several methods used to confirm the specificity of tetramer staining using IST. In the interaction between the p-MHC complex and TCR, both the amino acid sequence of the peptide and the haplotype of MHC are critical for determining specificity of the T cell. Both of these variables can be altered in an experiment to ensure the tetramer staining is specific. When changing the amino acid sequence of the peptide, the same fluorescently labeled MHC molecule is used, but the peptide loaded should not be present in the experimental system. For example, in studies of *Mamu-A\*001:01* rhesus macaques, FITC-labeled *Mamu-A\*001:01* tetramers loaded with an irrelevant peptide, FV10 (FLPSDYFPSV), a peptide from the hepatitis B viral core protein, served as a negative control for FITC-labeled MHCI *Mamu-A\*001:01* SIV Gag CM9 (181–189) (CTPYDINQM) tetramers and for FITC-labeled MHCI *Mamu-A\*001:01* SIV Tat SL8 (STPESANL) tetramers [17,29,30,35]. As another example, a study using tissues from human study participants used HLA-B\*57 tetramers loaded with MART (ELAGIGILTV), a peptide from a melanoma protein, to serve as a negative control for FITC-labeled HLA-B\*57 HIV-1 Gag IW9 (ISPRTLNAW) or QW9 (QASQEVKNW) tetramers during detection of HIV-specific tissue-resident CD8<sup>+</sup> T cells within the gastrointestinal tract in chronic infection [27].

Alternatively, studies have revealed specificity by using MHC-mismatched tetramers. These are tetramers loaded with the peptide of interest but not able to bind to T cells in the tissue due to cells in the tissue expressing different MHCI molecules [10,11]. A third type of negative control includes using a tissue that does not have Ag-specific cells of interest. In this case, the tissue and tetramer are the same haplotype, but the individual animal or study participant that was sampled was not infected with microbes [10,18,21].

We found that the specificity and sensitivity of IST staining is comparable to that of flow cytometry. Following the adoptive transfer of transgenic T cells into a wild-type mouse, the spleen of the recipient mouse was split in half. One half was used to determine the number of tetramer<sup>+</sup> CD8<sup>+</sup> T cells using flow cytometry, and the other half was used for IST staining. Both techniques showed that ~1% of the CD8<sup>+</sup> cells were tetramer<sup>+</sup> [10]. Haanen et al. found that the background staining in tissues permits detection limits of 0.1–1% of T cells, whereas the limit of detection of flow cytometry is less than 0.1% [13]. Nonetheless, the sensitivity of IST is sufficient to detect endogenous antigen-specific T cell responses.
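The comparison above rests on expressing tetramer<sup>+</sup> cells as a percentage of a reference population and checking that percentage against a detection limit. The following is a minimal sketch of that calculation; the cell counts are hypothetical, the function names are illustrative, and the limits are taken from the 0.1–1% range quoted above (which is stated relative to T cells, whereas this example uses CD8<sup>+</sup> cells as the denominator).

```python
def tetramer_frequency(tetramer_pos: int, reference_total: int) -> float:
    """Percentage of the reference population (e.g., CD8+ cells) that is tetramer+."""
    if reference_total <= 0:
        raise ValueError("reference_total must be positive")
    if not 0 <= tetramer_pos <= reference_total:
        raise ValueError("tetramer_pos must lie between 0 and reference_total")
    return 100.0 * tetramer_pos / reference_total


def above_limit(freq_percent: float, limit_percent: float) -> bool:
    """True if the observed frequency meets or exceeds the detection limit."""
    return freq_percent >= limit_percent


# Detection limits for IST quoted in the text (percent of T cells).
IST_LIMIT_OPTIMISTIC = 0.1
IST_LIMIT_CONSERVATIVE = 1.0

# Hypothetical counts from imaged sections: 52 tetramer+ cells among 5000 CD8+ cells.
freq = tetramer_frequency(52, 5000)

print(f"tetramer+ frequency: {freq:.2f}% of CD8+ cells")
print("above conservative IST limit:", above_limit(freq, IST_LIMIT_CONSERVATIVE))
print("above optimistic IST limit:  ", above_limit(freq, IST_LIMIT_OPTIMISTIC))
```

A frequency that falls between the two limits would be detectable only when background staining is low, which is one reason the negative controls described above matter.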

#### **6. Applications for In Situ Tetramer Staining**

As mentioned previously, traditional tissue processing for MHC tetramer staining by flow cytometry requires dissociation of the tissue into a single cell suspension. Though flow cytometry is powerful in providing information about the phenotypes of Ag-specific cells, it does not show where specifically in the tissue they are located, what the phenotypes of cells are in specific locations, or what cells they are interacting with within specific tissue compartments. This information can be critical for understanding T cell responses to infections, predicting vaccine efficacy, and investigating an immune response within the tumor microenvironment.

In HIV and SIV infection, CD4<sup>+</sup> T follicular helper (TFH) cells are a major site of viral persistence during antiretroviral therapy and are a critical barrier to eradication [34,45–50]. Studies using IST have shown that low levels of virus-specific CD8<sup>+</sup> T cells in lymphoid follicles permit ongoing viral replication in TFH [33,36]. This knowledge can inform future vaccine and cell therapy design for HIV infection by monitoring or inducing the accumulation of HIV-specific CD8<sup>+</sup> T cells in B cell follicles [17,28,29,34–36,51].

Another application combines IST with in situ hybridization (ISTH) to determine a direct spatial and temporal relationship between virus-specific CD8<sup>+</sup> T cells (effector cells) and virus-infected target cells. Looking at both the SIV infection of non-human primates (NHP) and LCMV infection in mice, it was determined that the location, timing, and abundance of antigen-specific T cells directly relates to the number of infected target cells [26,52].

In barrier tissues, like the lung, gut, and skin, CD4<sup>+</sup> and CD8<sup>+</sup> T cells take up residence following an infection. These tissue resident T cells (TRM) are the first line of defense against secondary pathogen exposure. IST staining has been used to increase our understanding of the immune response at these sites, including cell types required for the generation and maintenance of TRM. In a murine model of HSV-2 infection, IST staining was used to determine that CD301b<sup>+</sup> dendritic cells (DCs) are critical for initiating and maintaining the CD8<sup>+</sup> TRM population in the female reproductive tract (FRT) [53]. The authors showed that CD301b<sup>+</sup> DCs activate CD8<sup>+</sup> TRM to produce interferon-gamma (IFN-γ), and that this response is necessary for protection from HSV-2 challenge.

IST staining has also been used to visualize in situ dynamics of the immune response to *Listeria monocytogenes* (LM) [54]. As early as day 3 post-infection, LM-specific CD8<sup>+</sup> T cells are detected in the spleen of infected mice and located at the border of the T and B cell zones. The authors also showed an interaction of LM-specific CD8<sup>+</sup> T cells with CD11c<sup>+</sup> DCs in clustered foci within the T cell zones. Interestingly, after both influenza and LM infection, memory Ag-specific CD8<sup>+</sup> T cells can be found within B cell follicles. Upon a secondary challenge, there are many more foci of Ag-specific CD8<sup>+</sup> T cells within T cell zones, consistent with a robust CD8<sup>+</sup> memory T cell response to the secondary challenge.

Outside of infectious diseases, IST staining can be applied to other fields, including cancer biology and autoimmunity. In one study, IST was used to determine the feasibility of a dendritic cell-based vaccine for melanoma. In the three study participants examined, researchers detected tumor-specific CD8<sup>+</sup> T cells within the tumor following vaccination [42]. This has important implications for predicting the efficacy of cancer vaccines, as circulating T cell responses are not always a good indicator of protection. In another study, researchers investigated the CD8<sup>+</sup> T cell response in type 1 diabetes (T1D) [55]. Using IST, they described the first confirmation of Ag-specific, autoreactive CD8<sup>+</sup> T cells in the islet lesions from T1D patients. Interestingly, they also showed that recent onset patients (<1 year duration of the disease) had a more clonally restricted CD8<sup>+</sup> T cell response in the islets, whereas long standing patients (>1 year duration of the disease) had a more diverse CD8<sup>+</sup> T cell response [55]. Much like predicting vaccine efficacy, understanding the immune response at the site of autoimmunity, and not just in peripheral blood, is critical to improving future treatment of cancer and autoimmune diseases.

While the bulk of IST staining has been done using MHCI tetramers to study the CD8<sup>+</sup> T cell response, MHCII tetramers are available to investigate the CD4<sup>+</sup> T cell response to infections and autoimmunity. In one study examining patients with active *Mycobacterium tuberculosis* (*M. tb*) infection, IST staining showed Ag-specific CD4<sup>+</sup> T cells producing IFN-γ and TNF-α in the lymph nodes, lung granulomas, and cavernous tissue [12]. In experimental autoimmune encephalomyelitis (EAE), antigen-specific CD4<sup>+</sup> T cells were detected in the lymph nodes and central nervous system (CNS) of diseased animals, but not in uninfected animals [13].

#### **7. Limitations of IST Staining**

While there are many important applications for IST staining, this technique has significant limitations. Many of these limitations are inherent to tetramer technology. Unlike a peptide pool, which is used to broadly probe an antigen-specific T cell response, a tetramer presents only one peptide and, therefore, will only interact with T cells specific for that peptide. This can potentially cause researchers to underestimate the total immune response to a pathogen or vaccine and, similarly, can lead to overinterpretation of results for that epitope. Most often, tetramers are made against the immunodominant epitope. The immunodominant epitope is the peptide that the majority of the CD4<sup>+</sup> or CD8<sup>+</sup> T cell response is generated against. However, there are often subdominant responses that might be overlooked with tetramer staining.

Because tetramer technology takes advantage of the specific interaction between the p-MHC complex and the TCR, use of the technology requires determining the MHC genotype of an individual prior to examining the Ag-specific response. There may be limitations in the availability of tetramers for MHC molecules encoded by a particular allele. Additionally, there is the possibility that the immune response being visualized cannot be generalized to other MHC molecules. For example, people expressing *HLA-B27* and *-B57* MHC molecules are more often elite controllers of HIV infection [56,57]. In rhesus macaques, the *Mamu-A\*001:01* [58], *-B\*008:01* [59], and *-B\*017:01* [60] alleles are also associated with enhanced control of SIV infection. While understanding the immune response in elite controllers of HIV/SIV infection is valuable, it is not generalizable to all CD8<sup>+</sup> T cell responses during HIV/SIV infection.

Another limitation that can be a problem for the in situ visualization of Ag-specific CD4<sup>+</sup> T cells is the affinity threshold. The affinity threshold required for p-MHC tetramer staining is higher than that required for activation of the TCR, which biases tetramer technology to detect primarily high affinity T cells [61]. Work in recent years has shown that low affinity T cells contribute significantly to the immune response [62,63]. Though this has been described primarily for Ag-specific CD4<sup>+</sup> T cells, it can be found in Ag-specific CD8<sup>+</sup> T cells as well [64]. Researchers have begun to address this limitation by increasing the number of p-MHC complexes in multimers. They have changed the scaffold from biotin to dextran, which allows more p-MHC and more fluorescent molecules to bind, both of which help increase the detection of antigen-specific T cells [65].

One of the challenges unique to IST staining is that the best results are obtained using fresh tissue samples [18,21]. We showed that while fixed and frozen tissue can be used in IST, the best results are gathered from fresh, unfixed samples [10]. Nonetheless, sometimes experimental conditions do not allow for the use of fresh, unfixed samples.

We have not been successful at getting IST staining to work well and consistently in different experimental systems with fixed or frozen tissues. However, as mentioned above, others have demonstrated success in their experimental systems. Certainly, having a robust, reliable method to track and phenotype Ag-specific T cells in fixed and frozen tissues would be a great advantage to the scientific community. Advancements in existing IST staining methodologies, or the development of new methods, are warranted to achieve this goal. Future methods to detect Ag-specific T cells in fixed and frozen tissues may be on the horizon and may not rely on MHC-tetramers or multimers. For example, in situ hybridization methods that detect the unique hypervariable regions of TCR genes, termed complementarity-determining regions (CDRs), might be an effective means to track Ag-specific T cells in fixed and frozen sections. Indeed, Advanced Cell Diagnostics have recently developed an in situ hybridization method called BaseScope that may allow detection of CDRs.

In summary, IST combined with IHC has radically enhanced our understanding of the Ag-specific T cell response. Not only does it enable the determination of the magnitude and phenotype of Ag-specific CD4<sup>+</sup> and CD8<sup>+</sup> T-cell responses in situ, but it also is a critical tool in tracking their location within tissue compartments and cell–cell interactions. IST staining has been, and continues to be, used to enhance our understanding of the local cellular immune response in many areas of research, including cancer biology, vaccinology, viral pathogenesis, bacterial infection, and autoimmune diseases.

**Author Contributions:** H.M.A. wrote the manuscript and made the figures; E.K.C. assisted with drafting the manuscript; P.J.S. obtained funding, helped in drafting the manuscript, and provided oversight.

**Funding:** This work was funded by National Institutes of Health, grant numbers 1R01 AI143380-01 and 1UM1AI26617 and the APC was funded by 1R01 AI143380-01 and 1UM1AI26617.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*International Journal of Molecular Sciences* Editorial Office E-mail: ijms@mdpi.com www.mdpi.com/journal/ijms
