*3.8. Detection of Post-Translational Modifications*

The PTMs of proteins influence their activity, structure, turnover, localization, and capacity to interact with other proteins. In *Leishmania*, PTMs, together with mRNA stability and translation processes, are essential regulators of gene expression. In this study, the MS/MS spectra allowed the identification of phosphorylated, methylated, acetylated, glycosylated, and/or formylated proteins in the *L. infantum* proteome. Even though no specific enrichment of modified proteins was performed, we identified modified peptides that accounted for 10 phosphorylated, 139 methylated, 192 acetylated, 8 formylated, and 3 glycosylated proteins.

The phosphorylation of serine (S), threonine (T), and tyrosine (Y) residues increases their mass by 79.97 Da (unimod.org). The phosphorylated proteins identified in this study and their modified residues are listed in Table 6. Apart from two unknown phosphoproteins (LINF\_040005600 and LINF\_220013200), the ribosomal protein S10, α tubulins, an rRNA biogenesis protein-like protein, the 3-ketoacyl-CoA thiolase, the flagellar protein KHARON1, the glycogen synthase kinase 3 (GSK-3), and the prototypical HSP70 might be regulated by phosphorylation (Table 6). In fact, the phosphorylation of HSP70 has been reported to occur during the stress response in both promastigotes and amastigotes of *L. donovani* [65].
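
As a minimal illustration of how a mass increment such as the 79.97 Da quoted above enters peptide mass calculations, the following sketch adds monoisotopic shifts to an unmodified peptide mass. The shift values are the monoisotopic masses listed in Unimod; the dictionary, the function names, and the 1500.0 Da peptide are hypothetical and do not describe the search settings used in this study.

```python
# Monoisotopic mass shifts in Da (Unimod values, rounded to six decimals).
PTM_DELTA_DA = {
    "phospho": 79.966331,   # on S, T, or Y
    "methyl": 14.015650,
    "acetyl": 42.010565,
    "formyl": 27.994915,
    "hex": 162.052824,      # hexose
    "hexnac": 203.079373,   # N-acetylhexosamine
}

PROTON_MASS_DA = 1.007276


def modified_mass(unmodified_mass, modifications):
    """Add one mass shift per listed modification to a neutral peptide mass."""
    return unmodified_mass + sum(PTM_DELTA_DA[name] for name in modifications)


def precursor_mz(neutral_mass, charge):
    """Convert a neutral mass to the m/z of the ion carrying `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS_DA) / charge


# Hypothetical peptide of 1500.0 Da carrying a single phosphate group.
phospho_mass = modified_mass(1500.0, ["phospho"])
phospho_mz = precursor_mz(phospho_mass, charge=2)
```

A search engine compares such theoretical values with the observed precursor and fragment masses within a tolerance window, which is why each modification must be declared explicitly before the search.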


**Table 6.** Phosphoproteins in the *L. infantum* proteome.

Most of the observed phosphorylation events occurred on S and T, but phosphorylation on Y residues was also detected in some proteins. Rosenzweig et al. [66] identified 16 phosphorylated proteins in *L. donovani* promastigotes or amastigotes; in that study, all phosphorylations occurred at S and/or T residues. There is no overlap between the phosphoproteins identified by these authors and those identified in this study (Table 6); however, this was not unexpected because phosphorylation is a dynamic modification and the numbers of phosphorylated proteins identified in both studies were low. Although phosphorylation has been shown to occur much less frequently on tyrosine residues [67], it is noteworthy that phosphorylated tyrosines were found in α tubulins, the rRNA biogenesis protein, and GSK-3 (Table 6). Some of the phosphorylated proteins listed in Table 6 have also been identified in previous studies; for instance, the 3-ketoacyl-CoA thiolase was found to be phosphorylated (at serine 229) in *L. donovani* promastigotes [68]. Kinases and phosphatases regulate phosphorylation and dephosphorylation processes, and, accordingly, several serine, threonine, and tyrosine kinases and phosphatases have been identified in the *L. infantum* proteome (Supplementary File, Table S11).

Methylation (+14 Da) is a physiological PTM that occurs at the C- and N-terminal ends of proteins and on the side-chain nitrogens of arginine (R) and lysine (K); this modification is critical for regulating several cellular processes. Methylation has also been found on other amino acids, such as aspartic acid (D), glutamic acid (E), histidine (H), glutamine (Q), and asparagine (N) [69]. In the *L. infantum* proteome, 139 proteins were predicted to be methylated: 76 of them showed methylation at K or R residues, 123 showed methylated D and/or E residues, and a methylated H was found in β-tubulin. All the methylated proteins detected in this study are listed in Supplementary File, Table S12. Taken together, our findings indicate that methylation at D and E residues is relatively frequent in *Leishmania*; as suggested by Sprung et al. [69], methylation at E and D residues would increase the hydrophobicity of the modified proteins. Some examples of highly methylated proteins identified in this study are α and β tubulins, the heat shock proteins HSP70 and HSP83/90, and the elongation factor 2 (eEF2). Many orthologs of the methylated proteins detected in this study were also identified as methylated in *L. donovani* [66].
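
The counts given above for K/R and D/E methylation add up to more than 139, so the two categories must overlap. As a worked check, let $A$ be the set of proteins methylated at K or R and $B$ the set methylated at D and/or E (symbols introduced here only for illustration), and assume nothing beyond both sets belonging to the 139 methylated proteins:

$$
|A \cap B| = |A| + |B| - |A \cup B| \;\ge\; 76 + 123 - 139 = 60
$$

Under that assumption, at least 60 proteins carry both kinds of methylation; the exact number can only be read from the per-protein data in Table S12.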

Acetylation (+42.01 Da) is a PTM considered as relevant as phosphorylation in metabolic and signaling pathways. K acetylation is a reversible enzymatic modification that regulates protein function, and it is particularly relevant for chromatin compaction through the acetylation of histones [70]. Interestingly, the accumulation of acetylated histones has been observed at the polycistronic transcription initiation sites in *L. major* and *Trypanosoma cruzi* [71,72]. However, in the *L. infantum* proteomics data, no histone peptides bearing acetylated K were identified, as was also the case in the *L. donovani* proteome [66], suggesting that acetylated histones represent a relatively low proportion of total cellular histones. Some examples of proteins detected as acetylated in this study are β tubulins (LINF\_330015100, LINF\_330015200, and LINF\_210028500; modified at K297), guanylate kinase (LINF\_330018400; K3), the β subunit of ATP synthase (LINF\_250018000; K511), and a calpain-like cysteine peptidase (LINF\_140014400; K74).

In addition, acetylation at the N-terminal ends of proteins may occur either co- or post-translationally; it is a frequent modification of eukaryotic proteins, although its physiological consequences remain poorly understood [73]. This irreversible modification affects protein fate in the cell and is carried out by N-terminal acetyltransferases (Nat). In the *L. infantum* promastigote proteome, we identified three of these enzymes: Nat-1, Nat-B, and Nat-C (Supplementary File, Table S11). Among the 144 N-terminally acetylated proteins identified in this study (Supplementary File, Table S13), 40 proteins showed acetylation at their initial methionine (iM), and 104 were acetylated at the second amino acid, suggesting cleavage of the iM during protein maturation [74]. In the cases in which acetylation takes place at the iM, we detected a bias in the amino acid that follows the iM: in 40% of these acetylated proteins, the second amino acid was one of the polar uncharged residues N or Q (Figure 5A also shows the other amino acids most frequently found at the second position). Acetylation after iM removal was mainly found on S (55% of the cases) and alanine (A) (31%) residues (Figure 5B). These two amino acids, as well as threonine (found in 7.7% of the detected acetylated residues), have small side chains, a feature previously noted to favor more efficient iM cleavage in the course of protein maturation [66,75].
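
The two tallies described above (the residue following an acetylated iM, and the residue acetylated after iM removal) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the record format, the function names, and the toy sequences are all hypothetical.

```python
from collections import Counter


def classify_nterm_acetylation(records):
    """Tally N-terminal acetylation events.

    records: iterable of (protein_sequence, acetylated_position), where the
    position is 1-based. Position 1 means the initial methionine (iM) itself
    is acetylated; position 2 means the residue exposed after iM removal is.
    """
    next_to_im = Counter()      # residue that follows an acetylated iM
    after_cleavage = Counter()  # residue acetylated once the iM is removed
    for sequence, position in records:
        if position == 1 and sequence.startswith("M"):
            next_to_im[sequence[1]] += 1
        elif position == 2:
            after_cleavage[sequence[1]] += 1
    return next_to_im, after_cleavage


def percentages(counter):
    """Convert raw counts to percentages of the counter total."""
    total = sum(counter.values())
    return {residue: 100.0 * n / total for residue, n in counter.items()}


# Toy records (hypothetical sequences, not taken from Table S13).
toy_records = [
    ("MNKLV", 1), ("MQTPA", 1),
    ("MSDEK", 2), ("MAGHT", 2), ("MSLLR", 2), ("MTEEK", 2),
]
next_to_im, after_cleavage = classify_nterm_acetylation(toy_records)
```

Keeping the two counters separate mirrors the distinction between panels A and B of Figure 5, since the residue of interest plays a different role in each case.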

**Figure 5.** Features of the N-terminal acetylated proteins identified in this study. (**A**) Frequencies (percentages) of amino acids found next to the acetylated initial methionine (iM) and the putative enzymes responsible for the acetylation. (**B**) Percentages of amino acids found to be acetylated after cleavage of the iM and the putative enzyme involved in the reaction. (**C**) An example illustrating the usefulness of proteomics data for improving gene annotations. An acetylated peptide (red circle on the grey shaded sequence) was mapped to sequences located upstream of the currently annotated coding sequence for gene LINF\_010008200 (blue box). The corrected gene (pink box) fits well into the transcript (green box). The image in (**C**) was generated using the Integrative Genomics Viewer (IGV). CDS: coding sequence.

Furthermore, the analysis of acetylated peptides allowed us to correct the initiator AUG codon (and, therefore, the predicted amino acid sequence) of four previously misannotated genes (whose new coordinates are indicated in Supplementary File, Table S13). One example is illustrated in Figure 5C: the coding sequence of the LINF\_010008200 gene (coding for a poly(A) export protein) should be extended upstream by 36 triplets, based on the existence of an acetylated peptide encoded in that region.
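
The coordinate arithmetic behind such an upstream extension can be sketched as follows. The function name and the coordinates are hypothetical (the actual corrected coordinates are those given in Table S13), and the sketch assumes 1-based, inclusive, GFF-style coordinates.

```python
def extend_cds_upstream(start, end, strand, n_codons):
    """Return (start, end) after adding n_codons upstream of the start codon.

    Coordinates are 1-based and inclusive (GFF-style), with start <= end.
    On the minus strand, "upstream" means higher genomic coordinates.
    """
    shift = 3 * n_codons
    if strand == "+":
        new_start, new_end = start - shift, end
    elif strand == "-":
        new_start, new_end = start, end + shift
    else:
        raise ValueError("strand must be '+' or '-'")
    if new_start < 1:
        raise ValueError("extension runs past the start of the sequence")
    return new_start, new_end


# Hypothetical coordinates; 36 triplets correspond to 108 nucleotides.
print(extend_cds_upstream(5000, 5998, "+", 36))  # (4892, 5998)
print(extend_cds_upstream(5000, 5998, "-", 36))  # (5000, 6106)
```

In practice, the new start position would also need to coincide with an in-frame AUG and to lie within the mapped transcript, as shown for LINF\_010008200 in Figure 5C.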

Glycosylation also plays a relevant role in protein maturation, as well as in signal transduction mechanisms [76]. In this *L. infantum* proteomic study, we detected hexosylation or N-acetylhexosamine addition in three proteins; these modifications were consistent with N-linked glycosylation events at N residues. The modified proteins are cysteine peptidase A (CPA; LINF\_190020600) at N345, 3-hydroxy-3-methylglutaryl-CoA synthase (LINF\_240027300) at N340, and PUF6 (LINF\_330019100) at N483. The characterization of glycoproteins and the nature of their modifications remain challenging due to the complexity and variety of glycan moieties. In this study, only two simple modifications (hexosylation and N-acetylhexosamine addition) were searched for, which explains the extremely low number of detected glycosylated proteins. Rosenzweig et al. [66] found 13 glycosylated proteins, only one modified at an asparagine residue and the rest at serine or threonine residues.

Formylation has not been previously described in trypanosomatids, but N-terminal methionine formylation in eukaryotes has been linked to cellular stress and protein degradation processes [77]. In addition, formyl-lysine residues have been found in histones and other nuclear proteins [78]. In our study, eight proteins were detected as formylated, mainly at leucine (in β-tubulin and tryparedoxin), glycine (in a hypothetical protein; LINF\_260024300), serine (in an RNA-guanylyltransferase), and lysine (in HSP83/90, a paraflagellar rod protein, and a transaldolase) residues (see Supplementary File, Table S14). Future research on protein formylation in *Leishmania* and other organisms will provide insight into the physiological significance of this kind of PTM.
