#### 3.5.1. *Pseudomonas stutzeri*

Analysis of the LPS of *Pseudomonas stutzeri* OX1 revealed a novel type of highly negatively charged LOS, containing two 4,6-linked pyruvate ketals linked to *N*-acetylglucosamine and glucose, independently [4]. The overall LOS structure has been determined to be 4,6(*S*)Pyr-β-Glc-(1→3)[4,6(S)Pyrβ-GlcNAc](1→4)GalNAc(1→3)-Hep7Cm2*P*4*P*-(1→3)-Hep2*P*4*P*-(1→5)[Kdo-(2→4)]Kdo-(2→6)-β-GlcN4*P*-(1→6)-GlcN1*P*, where *P* represents a phosphate group and Cm is carbamoyl. *P. stutzeri* OX1 was isolated from the activated sludge of a wastewater treatment plant, where unusual metabolic capabilities for the degradation of aromatic hydrocarbons were found. Pyruvate residues might be used to block elongation of the LPS chain to yield an LOS. This would lead to a less hydrophilic cellular surface, indicating an adaptive response of *P. stutzeri* OX1 to a hydrocarbon-containing environment [4].

#### 3.5.2. *Providencia alcalifaciens*

Bacteria of the genus *Providencia* are opportunistic human pathogens that cause intestinal and urinary tract infections. The O-antigen-based serological classification scheme of *Providencia alcalifaciens, Providencia rustigianii*, and *Providencia stuartii* includes 63 O-serogroups [141], most of which are acidic. Complex core structures have been elucidated in several *Providencia* O-serogroups [142,143].

*P. alcalifaciens* O19 differs in its O-PS from other *Providencia* strains. The O-PS repeat contains a 4,6-pyruvylated GlcNAc residue and has the complete structure →2)-β-Fuc3NAc4Ac-(1→3)-4,6(*S*)Pyr-α-GlcNAc-(1→4)-α-Gal-(1→4)-β-Gal-(1→3)-β-GlcNAc-(1→. In the NMR spectra of the oligosaccharide, signals of the methyl group of pyruvic acid have been observed and the presence of pyruvic acid in two thirds of the O-units in the polysaccharide has been proven [141].

#### 3.5.3. *Shigella dysenteriae*

*Shigella dysenteriae* is an aetiological agent of various intestinal disorders, including shigellosis. The strains of this bacterium are serologically heterogeneous because of the diversity of the structures of their O-antigens [144].

The structure of the *S. dysenteriae* type 10 O-antigen was revised by Perepelov et al. to account for an acid-labile pyruvate modification that had been lost in a previous investigation because of acidic treatment of the sample. The full O-PS structure has been elucidated to be →2)-4,6(*S*)Pyr-β-d-Man*p*-(1→3)-α-d-Man*p*NAc-(1→3)-β-l-Rha*p*-(1→4)-α-d-Glc*p*NAc-(1→ [145].

#### 3.5.4. *Raoultella terrigena*

The enterobacterium *Raoultella terrigena* is another bacterium that carries a pyruvic acid modification on its O-PS β-Man residue, located at the O-4 and O-6 positions [146]. The structure of the repeating unit of the O-PS has been determined by means of chemical and spectroscopic methods and found to be a linear tetrasaccharide with the structure →2)-4,6(*S*)Pyr-β-d-Man*p*-(1→3)-α-d-Man*p*NAc-(1→3)-β-l-Rha*p*-(1→4)-α-d-Glc*p*NAc-(1→ [146], which is identical to that of *S. dysenteriae* type 10 [145].

#### 3.5.5. *Proteus mirabilis*

The O-PS repeating unit of *Proteus mirabilis* O16 has been established to be →3)-β-d-Glc*p*NAc-(1→3)-4,6-*R*-Pyr-α-d-Gal*p*NAc-(1→4)-α-d-Gal*p*A-(1→3)-α-l-Rha*p*2Ac-(1→ [147]. This structure differs significantly from the O-PS structures of other *Proteus* spp. from the serogroup O19, such as *Proteus vulgaris, Proteus hauseri*, and *Proteus penneri* strains, and thus was key to the reclassification of various *Proteus* strains.

#### 3.5.6. *Cobetia pacifica*

The O-PS from the LPS of *Cobetia pacifica* KMM 3878, an aquatic isolate from Japan, is composed of sulphated and pyruvylated trisaccharide repeats with the structure →4)-β-d-Gal-2,3-SO3H-(1→6)-β-d-Gal-3,4-*S*-Pyr-(1→6)-β-d-Gal-(1→ [148].

#### *3.6. Mycobacterial Glycolipids with Pyruvate*

Mycobacteria contain a variety of glycolipids, including, among others, acylated glucose, acylated trehalose, sulfatides, mannophosphoinositides, and glycopeptidolipids [149]. A crude glycolipid fraction from *Mycobacterium smegmatis* ATCC 356 obtained by ethanolic extraction and silica gel chromatography revealed the presence of hitherto unknown anionic glycolipids [150]. The corresponding glycan moiety has the structure 4,6Pyr-3-*O*-Me-β-d-Glc*p*-(1→3)-4,6Pyr-β-d-Glc*p*-(1→4)-β-d-Glc*p*-(1→6)-β-d-Glc*p*-(1→1)-α-d-Glc.

Members of the *Mycobacterium avium–Mycobacterium intracellulare* (MAI) complex are typeable on the basis of their specific antigenic glycolipid. For instance, the dominant epitope of the MAI serovar 8-specific glycopeptidolipid is a terminal 4,6Pyr-*O*-Me-α-d-Glc*p* unit, whereas that of the MAI serovar 21 has the same terminal pyruvylated glucose devoid of the 3-methoxy group [151]. Healthy individuals of some populations carry antibodies that are specific to these pyruvylated epitopes on the glycopeptidolipids. It is currently unclear whether these antibodies reflect previous exposure to one or both of these serovars or whether some other common cross-reacting pyruvylated environmental antigen is involved [151]. However, this finding might have protective implications against mycobacterioses and other infectious diseases.

#### *3.7. Pyruvylated Glycoconjugates in Eukaryotes*

#### 3.7.1. Eukaryotic Glycolipids

Information on pyruvylated glycoconjugates in eukaryotes is scarce in comparison to their description in bacteria. It is currently not clear whether this reflects the natural distribution of pyruvylation or if pyruvylation on eukaryotic glycoconjugates has escaped detection. Notably, pyruvylation has so far not been detected in humans.

An "exotic" example of a pyruvylated eukaryotic glycoconjugate is the phosphonoglycosphingolipid containing pyruvylated galactose in the nerve fibres of the sea hare *Aplysia kurodai*. The glycan structure of this phosphonoglycosphingolipid is 3,4Pyr-β-Gal-(1→3)-α-GalNAc-(1→3)-α-Fuc-(1→)-2-aminoethylphosphonyl-Fuc-(1→6)-β-Gal-(1→4)-Glc-(1→ [152].

#### 3.7.2. *N*-Linked Glycans in Yeast

Yeast species are known for the production of high- or oligo-mannosidic *N*-glycans that are displayed on various cell surface proteins [153]. In several yeast species (e.g., *Saccharomyces cerevisiae*, *Candida albicans*, *Pichia holstii*, and *Pichia pastoris*), phosphate groups or, to a lesser extent, sialic acids present on these extracellular glycans provide the necessary negative cell surface charge [153–155].

*S. pombe* is a notable example of a yeast whose net negative surface charge is neither conferred by phosphate nor by sialic acid. Instead, the *N*-linked galactomannans of *S. pombe* have pyruvylated β-Gal-(1→3)-(PvGal) caps on a portion of the α-Gal-(1→2)-residues in their outer *N*-glycan chains [156]. *S. pombe* lacks the ER Man9-α-mannosidase function as known from, for example, *Saccharomyces cerevisiae*. Therefore, it adds further mannose and galactose residues to the common *N*-glycan core structures, yielding galactomannans [157–159]. At least five different genes are required to synthesize the PvGal epitope. It is assumed that 4,6Pyr-β-Gal-(1→3) synthesis is carried out by a coordinated enzymatic system in which the β-Gal-(1→3) residues are first added to the *S. pombe* galactomannans and subsequently pyruvylated by the pyruvyltransferase Pvg1p [3] (see Section 5.1.3.). However, the complete mechanism for PvGal biosynthesis is currently unknown [3].

4,6Pyr-β-Gal is predicted to be the only contributor to the net negative cell surface charge of yeast, as disruption of the *pvg1*+ gene resulted in charge abolishment [160].

#### 3.7.3. Pyruvylated Galactans of Algae

Pyruvylated galactan sulphates are often found in red algal polysaccharides, which generally contain 3-substituted 4,6Pyr-d-Gal*p* residues. Among these galactans is that of *Palisada flagellifera*, which represents a highly complex structure with at least 18 different types of derivatives that are found mostly pyruvylated, 2-sulfated, and 6-methylated [161]. Another galactan is that of *Solieria chordalis*, the structure of which remains unknown but was shown to have high immunostimulating potential [162]. Other examples include the carrageenans from Australian red algae of the family *Solieriaceae* [10] and galactans of the red seaweed *Cryptonemia crenulata* [13].

Examples of green algae include the highly pyruvylated and sulfated galactans from tropical green seaweeds of the order *Bryopsidales*, which have anticoagulant activity [11], such as that of *Codium divaricatum* with the structure Gal*p*-(4SO4)-(1→3)-Gal*p*-(1→3)-Gal*p*-(1→3)-Gal*p* and 3,4Pyr-Gal*p*-(6SO4)-(1→3)-Gal*p* [12].

#### 3.7.4. Pyruvylated Proteoglycan

The marine sponge *Microciona prolifera* produces a pyruvylated adhesion proteoglycan with the structure 4,6Pyr-β-Gal-(1→4)-β-GlcNAc-(1→3)-Fuc-(1→ that is involved in species-specific cell re-aggregation [163].

#### **4. Methods for Research of Pyruvylated Glycoconjugates**

#### *4.1. Isolation of Pyruvylated Bacterial Glycoconjugates*

Several protocols for the isolation of glycoconjugates are in use; however, there is no specific general protocol for pyruvylated glycoconjugates. The procedures depend strongly on the source of the glycoconjugate, in particular on the cell wall architecture (i.e., Gram-positive versus Gram-negative bacteria), and on the class of glycoconjugate. Further, the extraction protocol needs to be optimised for each organism studied. The only commonality is that, because the pyruvate ketal is acid-labile, acidic conditions should be avoided during the isolation of pyruvylated glycoconjugates to prevent the loss of pyruvate [9,100,145].

For the extraction of EPS, for instance, the types of interactions by which the EPS matrix is created need to be taken into account, including variable extents of electrostatic interactions, van der Waals forces, hydrogen bonds, and hydrophobic interactions [164]. In most cases, physical forces are used to extract EPSs, such as centrifugation and filtration [28], stirring, pumping or shaking, heat treatment, or sonication [164]. Chemical steps include alkaline treatment with NaOH, addition of EDTA for removal of cations, addition of NaCl, use of ion exchange resins (e.g., Dowex), or enzymatic treatment [63,165]. If proteases are used for break-down of co-isolated proteins, an *O*-deacetylation step needs to be introduced to avoid the loss of putative acetyl groups on the EPS [55]. All mentioned chemical additives increase the solubility of the EPS in the aqueous phase; to solubilise EPS with hydrophobic portions, such as that from *Klebsiella pneumoniae*, detergents are necessary [166]. For the precipitation of EPS from the aqueous phase, ethanol is routinely used [65]. To enhance the EPS yield, a combination of physical and chemical methods is often applied [164].

For the extraction of CPS from Gram-negative bacteria, again, NaCl and EDTA are recommended [167]. Other protocols for the release of CPS are based on heat treatment of cells followed by precipitation of the CPS with acetone [80,90].

The isolation of SCWPs, both classical and "non-classical" forms, is divided into two main steps: (i) the purification of the peptidoglycan sacculus, which includes treatment of cells with heat, SDS, nuclease, and a protease such as trypsin, and (ii) the extraction of SCWP by either ethanol precipitation for WTAs or hydrofluoric acid treatment followed by ethanol precipitation for "non-classical" SCWPs [115,168,169]. The isolation of the "non-classical" pyruvylated SCWP of *B. anthracis* was recently described in detail [169].

The extraction of LPS and other cell surface polysaccharides has been described previously [170,171]. Prior to extraction of LPS, pelleted Gram-negative bacteria are usually depleted of CPS by aqueous washing [171]; most commonly, LPS is then extracted following established procedures [146,172,173] or, in the case of LOS, with phenol/chloroform/petrol ether (PCP) [174,175]. The crude extracts are subsequently de-*O*/*N*-acylated under mild acidic or basic conditions, with a preference for the latter. Further purification of the samples can be achieved by size exclusion and/or ion exchange chromatography [176].

#### *4.2. Pyruvate Analytics*

#### 4.2.1. Lectin Approach

Serum amyloid P component (SAP)—a normal plasma glycoprotein—has a Ca2+-dependent binding specificity for 4,6Pyr-*O*Me-β-d-Gal*p* (MOPDG) [177], and thus behaves like a lectin and may be a useful probe for this epitope as present in the cell walls of bacteria and other organisms [178]. SAP has been found to bind *in vitro* to *K. rhinoscleromatis* [89], the cell wall of which is known to contain this particular pyruvylated epitope. Binding was shown to be less pronounced to *X. campestris*, which contains a 4,6Pyr-Man*p* epitope [18], and no SAP bound to *E. coli*, which contains pyruvate 4,6-linked to glucose or to *S. pneumoniae* type 4, which contains pyruvate 2,3-linked to Gal*p* [74]. Binding of SAP to those organisms, which it did recognise, was completely inhibited or reversed by millimolar concentrations of free MOPDG.

#### 4.2.2. Biochemical Pyruvate Assays

A specifically developed colorimetric/fluorometric assay for ketal-pyruvate detection via enzymatic oxidation has been incorporated in a recently introduced high throughput screening platform for the structural analysis of novel EPS structures [179], which underlines the importance of pyruvylated epitopes. The platform is based on ultra-high performance liquid chromatography coupled with ultra-violet and electrospray ionization ion trap detection following EPS isolation.

A similar procedure for the detection of free pyruvate is used in clinical settings. Pyruvate is an important metabolite that feeds the citric acid cycle, and it is measured when screening for liver diseases and genetic disorders in humans, as these are reflected by high pyruvate levels [180]. The procedure is based on the oxidation of pyruvate by pyruvate oxidase, which produces acetyl phosphate, CO2, and H2O2. The latter is detected in a horseradish peroxidase reaction with a fluorometric probe, which leads to the formation of resorufin. Colour development can be detected at 570 nm, and fluorescence at 530–540 nm for excitation and 585–595 nm for emission (Cayman pyruvate assay kit: https://www.caymanchem.com/pdfs/700470.pdf).

Other methods for pyruvate detection stem from food analytics, as the pyruvate content reflects the degree of pungency of onions. Several methods are based on the photometric determination of total 2,4-dinitrophenylhydrazine-reacting carbonyls in a sample. Furthermore, the oxidation of reduced diphosphopyridine nucleotide (DPNH, i.e., NADH) by pyruvate can be measured in a coupled reaction with lactate dehydrogenase. The decrease in absorbance at 340 nm correlates with the oxidation of DPNH and, therefore, with the concentration of pyruvate [181,182].
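The coupled lactate dehydrogenase read-out can be converted into a concentration with the Beer–Lambert law. The following is a minimal sketch, not taken from the cited protocols: it assumes a 1:1 stoichiometry between pyruvate reduced and DPNH (NADH) oxidised, a reaction run to completion, and the commonly used molar absorption coefficient of NADH at 340 nm (6220 M⁻¹ cm⁻¹). The function name, the default path length, and the dilution factor are illustrative assumptions.

```python
# Commonly used molar absorption coefficient of NADH (DPNH) at 340 nm.
NADH_EPSILON_340 = 6220.0  # M^-1 cm^-1


def pyruvate_concentration_mM(a340_start, a340_end, path_cm=1.0, dilution=1.0):
    """Estimate pyruvate (mM) from the drop in A340 in a coupled LDH assay.

    Assumes one NADH is oxidised per pyruvate reduced to lactate and that
    the reaction has run to completion (illustrative helper, hypothetical name).
    """
    delta_a = a340_start - a340_end
    if delta_a < 0:
        raise ValueError("A340 should decrease as NADH is oxidised")
    molar = delta_a / (NADH_EPSILON_340 * path_cm)  # Beer-Lambert: c = dA / (eps * l)
    return molar * 1000.0 * dilution                # M -> mM, corrected for dilution
```

With illustrative readings of 0.922 before and 0.300 after the reaction in a 1 cm cuvette, the absorbance drop of 0.622 corresponds to roughly 0.1 mM pyruvate in the cuvette.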

Assaying pyruvylation reactions of monosaccharides using HPLC-based approaches is dependent on the intended mode of detection. Frequently, specifically introduced saccharide modifications are used for detection purposes. One prominent example is the chemical attachment of *para*-nitrophenol (pNP) to the saccharides of interest for monitoring at 265 nm. To determine, for instance, the activity of the yeast pyruvyltransferase Pvg1p, the pyruvylated product species was separated from the unpyruvylated educt species using a COSMOSIL 5C18-P reversed phase (RP) C18 column with 0.3% ammonium acetate, pH 7.4, containing 13% acetonitrile as a solvent. The pyruvylated product eluted from the column earlier than the educt, as monitored by recording the absorbance at 265 nm [160]. Another option is the use of a RP-C18 column in combination with a 1-propanol gradient in 88% 100 mM ammonium bicarbonate, accompanied by the detection of the nitrophenyl-modified sugar at an absorbance of 405 nm [7].

A more sophisticated fluorescent polyisoprenoid chemical probe, 2-amideaniline-undP-PP-AAdGal-Gal, which is an acceptor substrate mimic from the *B. fragilis* CPS A tetrasaccharide biosynthesis pathway, was established by Sharma et al. to monitor pyruvylation of the fluorescent lipid-linked substrate by the pyruvyltransferase WcfO directly by HPLC on a C18 column. Isocratic elution with 35% 1-propanol and 65% 100 mM ammonium bicarbonate was used, and detection was by fluorescence with excitation at 340 nm and emission at 390 nm [7].

#### 4.2.3. NMR Analysis of Pyruvylation

Nuclear magnetic resonance (NMR) is a versatile tool for the non-invasive structure elucidation of bacterial polysaccharides, including substitutions such as pyruvic acid [183], which can be frequently found as 4,6-*O*, 3,4-*O*, or 2,3-*O* acetals. Systematic investigations of defined pyruvylated monosaccharides revealed stereospecific repeating patterns from which the absolute configuration of pyruvic acid acetals can be inferred [184]. It was shown that the 13C signal of an equatorial 4,6-pyruvate methyl group (Figure 4I,II) resonates at ~26 ppm, while the axial methyl group can be found at ~17 ppm. For 3,4 acetals, the 13C shifts have been studied in detail, and the difference between axial and equatorial methyl groups was found to be much smaller in comparison to 4,6. The 1H difference, however, is in this case more noticeable [185]. The ring form of the acetal being either 5- (for 3,4-O) or 6- (for 4,6-O) membered is reflected by 13C shifts [186]. It has also been shown that for most 4,6 acetals, the configuration of the methyl group is equatorial, which results in an *S* configuration for the d-*gluco*- and d-*manno*-pyranosyls and an *R* configuration for the d-*galacto*-pyranosyls [187].

**Figure 4. I** and **II**, equatorially oriented methyl groups of 4,6 *galacto*- and *manno*-pyranosyls. **III**, identification of the attachment site of pyruvylation via heteronuclear multiple bond correlation (HMBC). **IV**, through-space correlation of the pyruvate methyl group to a ring proton, in this case H3. Pink arrows indicate through-bond interactions of neighbouring protons (HMBC), while blue arrows indicate through-space interactions of neighbouring protons (NOE).

In 1H NMR, the presence of pyruvic acid (4,6-, 3,4-, 2,3-) is usually indicated by a single prominent signal of the methyl group between 1.3 and 1.7 ppm, with a threefold higher relative intensity (peak area) in relation to another indicative one-proton signal such as the anomeric proton. For repeating units of polymers, the peak area of the methyl signal relative to another indicative signal reveals the degree of pyruvate substitution.
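The relation between the methyl integral and the degree of substitution can be written out explicitly. The sketch below is an illustration under stated assumptions rather than a published procedure: it assumes at most one pyruvate per repeating unit, a reference signal that arises from a known number of protons per repeating unit (one for a single anomeric proton), and integrals taken from the same spectrum. The function name is hypothetical.

```python
def degree_of_pyruvylation(methyl_area, reference_area, reference_protons=1):
    """Fraction of repeating units carrying pyruvate, from 1H NMR integrals.

    The pyruvate methyl group contributes three protons, so its integral is
    normalised by 3 and compared with the per-proton integral of a reference
    signal (e.g., one anomeric proton per repeating unit).
    """
    if methyl_area < 0 or reference_area <= 0 or reference_protons <= 0:
        raise ValueError("integrals must be positive (methyl area may be zero)")
    per_proton_methyl = methyl_area / 3.0
    per_proton_reference = reference_area / reference_protons
    return per_proton_methyl / per_proton_reference
```

For example, an illustrative methyl integral of 2.0 against an anomeric integral of 1.0 gives a degree of substitution of about 0.67, i.e., pyruvate on roughly two thirds of the repeating units; a ratio of 3.0 to 1.0 corresponds to full substitution.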

In 13C NMR, the presence of pyruvate substitution is usually indicated by the presence of signals for the pyruvic methyl group around 17–30 ppm (Figure 4, C3). The quaternary acetal carbon (Figure 4, C2) resonates in the anomeric region around 100 ppm (for 4,6-*O*) or 110 ppm (for 3,4-*O* and 2,3-*O*). Additionally, the quaternary signal of the carboxylic acid (Figure 4, C1) can be found between 170 and 180 ppm, with the 4,6-pyruvates present more towards 170 ppm and the 2,3 and 3,4 acetals found closer to 180 ppm [131].
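The 13C shift ranges quoted in this section lend themselves to a simple first-pass lookup. The following sketch only encodes the approximate values given in the text (ketal carbon near 100 ppm for 4,6-*O* and near 110 ppm for 3,4-*O* or 2,3-*O*; 4,6-methyl carbon near 26 ppm when equatorial and near 17 ppm when axial). The tolerance windows and function names are arbitrary assumptions for illustration, and any assignment would still need confirmation by HMBC and NOE experiments.

```python
def ring_type_from_ketal_carbon(shift_ppm, window=4.0):
    """Suggest the ketal type from the quaternary ketal carbon shift (ppm)."""
    if abs(shift_ppm - 100.0) <= window:
        return "4,6-O"              # six-membered ketal ring
    if abs(shift_ppm - 110.0) <= window:
        return "3,4-O or 2,3-O"     # five-membered ketal ring
    return "unassigned"


def methyl_orientation_4_6(shift_ppm, window=3.0):
    """Suggest the methyl orientation of a 4,6-ketal from its 13C shift (ppm)."""
    if abs(shift_ppm - 26.0) <= window:
        return "equatorial"
    if abs(shift_ppm - 17.0) <= window:
        return "axial"
    return "unassigned"
```

Values that fall between the two windows are deliberately left unassigned instead of being forced into one class.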

The connectivity between the pyruvate and a saccharide is routinely determined by the employment of long-range 1H-13C correlation detection methods such as heteronuclear multiple bond correlation (HMBC) experiments, which usually give correlation information over three and more bonds from the corresponding ring protons to the quaternary carbon of the acetal (Figure 4III). Therefore, 4,6-, 3,4-, or 2,3-pyruvic acetal identification is straightforward. The absolute configuration of the pyruvic acid acetal can be confirmed by through-space correlation experiments such as 1D or 2D NOESY (nuclear Overhauser and exchange spectroscopy), ROESY (rotating frame Overhauser enhancement spectroscopy), or GOESY (gradient nuclear Overhauser and exchange spectroscopy) (Figure 4, IV) [74,146,188,189].

#### 4.2.4. MS Analysis of Pyruvylation

Mass spectrometry in combination with NMR is a very powerful tool to determine the presence and position of pyruvylation in oligo- or polysaccharides. Common approaches are based on the break-up of polysaccharides by acid hydrolysis or methanolysis, followed by either silylation [59,66] or acetylation [148] and by gas chromatography (GC) or electrospray ionization-mass spectrometry (ESI-MS) analysis of the resulting monosaccharides [190]. Usually, characteristic patterns in the mass spectrum at the monosaccharide level are observed in the presence of pyruvates, such as a characteristic fragment ion at *m*/*z* 363 (M-COOMe) consistent with a molecular mass of 422, which would be indicative of a methyl *O*-(1-carboxyethylidene)hexopyranoside methyl ester di-*O*-trimethylsilyl ether. Comparison of the methylation analysis of native and depyruvylated polysaccharides allows the original position of the acetal linkages to be pinned down. Methanolysis and reductive cleavage have been described for the analysis of pyruvate-containing polysaccharides [191]. General procedures for the MS analysis of oligosaccharides have been reviewed in detail elsewhere [192,193].
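The nominal masses quoted for the silylated derivative can be checked with simple bookkeeping of elemental compositions. The sketch below is a worked arithmetic example, not a fragmentation model: it builds the formula of a methyl *O*-(1-carboxyethylidene)hexopyranoside methyl ester di-*O*-trimethylsilyl ether step by step from a methyl hexopyranoside, using integer nominal masses, and then subtracts the COOMe group. The helper names are illustrative.

```python
from collections import Counter

# Integer (nominal) masses of the most abundant isotopes.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16, "Si": 28}


def nominal_mass(formula):
    """Sum the nominal masses of all atoms in an element -> count mapping."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


def combine(base, change, times=1):
    """Return a new composition with `change` added `times` times to `base`."""
    result = Counter(base)
    for element, count in change.items():
        result[element] += count * times
    return result


methyl_hexopyranoside = Counter({"C": 7, "H": 14, "O": 6})
pyruvate_ketal = {"C": 3, "H": 2, "O": 2}   # + pyruvic acid (C3H4O3) - H2O
methyl_ester = {"C": 1, "H": 2}             # COOH -> COOCH3
tms_ether = {"C": 3, "H": 8, "Si": 1}       # OH -> OSi(CH3)3
cooch3 = {"C": 2, "H": 3, "O": 2}           # group lost in the fragment ion

derivative = combine(methyl_hexopyranoside, pyruvate_ketal)
derivative = combine(derivative, methyl_ester)
derivative = combine(derivative, tms_ether, times=2)  # two remaining hydroxyls

molecular_mass = nominal_mass(derivative)               # 422
fragment_mz = molecular_mass - nominal_mass(cooch3)     # 363
```

The resulting composition is C17H34O8Si2, which reproduces the molecular mass of 422 and the M-COOMe fragment at *m*/*z* 363 given in the text.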

#### **5. Ketal-Pyruvyltransferases**

#### *5.1. Substrate Specificity of Ketal-Pyruvyltransferases*

The pyruvyltransferase CsaB from the SCWP biosynthesis pathways of *P. alvei* [109], the pyruvyltransferase WcfO from CPS A biosynthesis of *B. fragilis* [7], and the pyruvyltransferase Pvg1p from the *N*-glycan biosynthesis of *S. pombe* [161] are among the few studied enzyme orthologues.

#### 5.1.1. CsaB from *P. alvei*

*P. alvei* CsaB catalyses the pyruvate modification on a β-d-ManNAc residue present in every SCWP repeat; the resulting 4,6Pyr-β-d-ManNAc is the essential epitope for the binding of the bacterium's SLH domain-containing S-layer protein SpaA, as revealed by the co-crystal structure of a synthetic pyruvylated ligand with truncated SpaASLH [104,194]. Supporting data come from isothermal titration calorimetry, which revealed binding between these modules only when the pyruvate entity was present [104].

Notably, a comparable mode of binding is elaborated between the S-layer protein Sap of *B. anthracis* and its pyruvylated SCWP [195]. However, poly-pyruvylation, as in the *P. alvei* SCWP, is not found in the *B. anthracis* SCWP, where only the β-d-ManNAc of the terminal repeat is modified [14].

For *B. anthracis*, a model for a Wzx/Wzy-dependent biosynthesis has been proposed, including cytoplasmic pyruvylation of the terminal repeating unit; however, the model is without any evidence of the nature of the acceptor substrate and biochemical proof of CsaB activity [113].

Concerning the pyruvyltransferase CsaB from *P. alvei*, in an *in vitro* enzyme assay, the necessity of a lipid-portion of the acceptor could be unambiguously demonstrated; neither free ManNAc nor a nucleoside-diphosphate-linked substrate was accepted as a substrate [109]. Using *P. alvei* TagA in combination with a synthetic 11-phenoxy-undecyl-PP-α-GlcNAc acceptor and UDP-ManNAc as substrate, it was demonstrated that TagA is an inverting UDP-α-d-ManNAc:GlcNAc-lipid carrier transferase of *P. alvei*. The produced 11-phenoxyundecyl-PP-α-d-GlcNAc-(1→4)-β-d-ManNAc compound was an acceptor substrate for 4,6-ketalpyruvyl transfer catalysed by recombinant *P. alvei* CsaB using PEP as a donor substrate [109], yielding a lipid-PP-linked pyruvylated disaccharide precursor (Figure 5).

Subsequent steps of SCWP biosynthesis in *P. alvei* remain elusive, as there is currently neither in silico nor experimental data available favouring a Wzx/Wzy- or ABC transporter-dependent pathway.

**Figure 5.** *In vitro* one-pot reaction involving the *Paenibacillus alvei* enzymes MnaA, TagA, and CsaB in combination with a synthetic lipid-PP-GlcNAc primer, demonstrating the preference of *P. alvei* CsaB for a lipid-PP-linked disaccharide substrate [109]. Pyruvylation (CsaB) is indicated by a star. RU: repeating unit. Monosaccharide symbols are shown according to the Symbol Nomenclature for Glycans (SNFG) [45].

#### 5.1.2. WcfO from *B. fragilis*

The necessity of a lipid-PP-bound substrate for pyruvyltransferase activity is supported by studies on the pyruvyltransferase WcfO from the *B. fragilis* CPS A biosynthesis; CPS A is composed of tetrasaccharide repeats containing an internal 4,6Pyr-Gal residue. On the basis of the stepwise enzymatic processing of an undp-PP-AAdGal*p* precursor *in vitro*, pyruvylation by WcfO was predicted to occur in the cytoplasm at the stage of the lipid-linked CPS A repeat unit precursor undp-PP-AAdGal*p*-Gal before completion of the tetrasaccharide repeat and completion of the CPS A in the periplasm [7,93] (Figure 3). Importantly, WcfO was inactive on UDP-galactose or pNP-galactose, supporting the requirement of a lipid-P carrier for pyruvyltransferase activity of WcfO [7].

Enzymatic transfer of pyruvate onto lipid-bound sugar intermediates has also been previously described in CPS biosynthesis of *Rhizobium trifolii* [196,197] and in xanthan biosynthesis of *Xanthomonas campestris* [31].

#### 5.1.3. Pvg1p from *S. pombe*

In contrast, the third functionally characterized 4,6-ketal-pyruvyltransferase, Pvg1p from *S. pombe*, was shown to be active *in vitro* with both pNP-β-Gal and pNP-β-lactose as acceptor substrates [160]. According to studies of the pyruvylation mechanism, Pvg1p resides in the membrane of the Golgi apparatus where it adds the pyruvate moiety to the Gal caps of its *N*-glycans. For this purpose, PEP is transported by two transporters, Pet1p and Pet2p, into the lumen of the Golgi apparatus where it serves as a donor substrate for the pyruvylation reaction [160].

A recent study has determined the crystal structure of the Pvg1p enzyme [160]. Pvg1p consists of 12 α-helices and 12 β-sheets, with 2 α/β/α domains at the N- and C-terminal half regions. Charged surface representation analysis revealed a positively charged cleft situated between the N- and C-terminal halves of Pvg1p, which suggests a possible mode of binding that may accommodate the negatively charged PEP donor substrate. Since neither PEP- nor pNP-β-Gal-co-crystal structures with the enzyme could be obtained, the empty substrate-binding cleft was used as a scaffold for computational substrate modelling using PEP [198]. In the proposed computational model, residues R217, R337, L338, and H339 form direct hydrogen bond contacts with PEP. Residues L338, H339, and D240 also appear to function in maintaining the shape of the PEP-binding pocket via a set of specific interactions. The crystallization study indicated that the pyruvylation process mimics sialylation; interestingly, Pvg1p shows resistance to sialidase digestion. Thus, a better characterization of the effects of pyruvylation might facilitate the development of pharmaceutical glycoproteins [198]. From the same research group, an enzyme was characterised as a 4,6Pyr-β-d-Gal-releasing enzyme (PyrGal-ase) with specificity for the (1→3) yeast linkage; mammalian (1→4)-linked PyrGal could not be hydrolysed. The physiological role of the PyrGal-ase in the *Bacillus* strain from which it was isolated is currently unknown [199].

Except for the three characterized pyruvyltransferases, no data on either the activity or the substrate specificity of pyruvyltransferases are available in the literature. This is surprising, considering that pyruvylation on glycoconjugates is widely distributed in nature. Future research on ketal-pyruvyltransferases should be directed towards mechanistic investigations of the enzyme's mode of catalysis, as well as inhibitor screening, similar to that of the enol-pyruvyltransferase MurA, which is a prominent target of antibiotics.

#### *5.2. Challenges in Research of Ketal-Pyruvyltransferases*

Currently, no definite classification of pyruvyltransferases is possible, although orthologous enzymes are predicted in various organisms. The Carbohydrate-Active enZYme (CAZy) database (http://www.cazy.org/) reveals, for instance, putative polysaccharide pyruvyltransferases from *Clostridium stercorarium* subsp. *stercorarium* DSM 8532 and *Clostridium thermosuccinogenes* DSM 5807 belonging to the glycosyltransferase 4 (GT4) family, with a classification as retaining GT type B fold-like glycosyltransferases.

The limited number of characterised pyruvyltransferases is due to the challenges faced in setting up *in vitro* enzyme assays. While commercially available PEP has been proven to be a suitable donor substrate for the transfer of the pyruvyl moiety in distinct cases [7,109,195], the availability of suitable acceptor substrates is a limiting factor. Free saccharides have not been recognised as acceptor substrates by the pyruvyltransferases investigated so far [7,109]. According to our current knowledge, these enzymes instead require more elaborate intermediates from the pyruvylated glycoconjugate's biosynthesis pathway. The mode of biosynthesis might be an *en bloc* (involving an ABC transporter) or a sequential synthesis (involving a Wzx flippase and a Wzy polymerase), according to the terminology introduced for LPS biosynthesis routes [97,98,138,200], so that pyruvylation is either a pre- or a post-polymerization modification. Depending on the glycoconjugate structure and this mode of biosynthesis, di-, tri-, or even oligosaccharide repeating units might be required. Furthermore, most glycoconjugates are biosynthesized on a membrane-embedded lipid carrier, such as undp-P or diacylglycerol. Such lipid-linked glycan precursors usually cannot be purified from the natural source in sufficient quantity and purity because of the high turnover rates and efficient recycling pathways of these lipid carriers, which are shared between several cellular glycoconjugate biosynthesis pathways, including that of peptidoglycan [201].

Thus, complex saccharide acceptor substrates are required, which are not commercially available. These compounds need to be produced along sophisticated and laborious chemical synthesis schemes, which also need to account for a lipophilic portion, either in the form of the native lipid carrier or a simplified mimic thereof.

For identifying a suitable acceptor substrate for an *in vitro* pyruvyltransferase assay, a delicate balance between the best possible acceptor mimic and solubility needs to be found in order to enable subsequent analytical procedures. To overcome all these challenges, the development of novel chemical, enzymatic, or chemo-enzymatic synthesis strategies for acceptor substrate production is a current major focus in pyruvyltransferase research.

#### *5.3. Sequence Space of Ketal-Pyruvyltransferases*

This review aimed at exploring the currently known sequence variation (extant sequence space) of pyruvyltransferases and their taxonomic distribution on the basis of the three functionally characterized sequences—*P. alvei* CsaB (K4ZGN3), *S. pombe* Pvg1p (Q9UT27), and *B. fragilis* WcfO (Q5LFK7).

#### 5.3.1. Methods

The best 50 sequence hits of BLAST searches with K4ZGN3\_CsaB, Q9UT27\_Pvg1p, and Q5LFK7\_WcfO were aligned with MAFFT using the algorithm FFT-NS-2. The three alignments were then used as queries for hmmsearch [202] on the UniProtKB database, setting significant E-values for sequences <9.0 × 10<sup>−30</sup> and for hits <9.9 × 10<sup>−30</sup>. Results were restricted to hits showing a pyruvyltransferase domain (PS\_pyruv\_trans domain; Pfam: PF04230). The resulting three sequence selections were filtered for incomplete sequences and annotated according to their taxonomy using the online tool SeqScrub [203]. The sequences were further submitted to the Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) [204] with an initial BLAST E-value of 1 × 10<sup>−5</sup> to calculate sequence similarity networks (SSNs). Sequences were restricted to a length between 250 and 600 amino acids, and the calculated networks were displayed at an alignment score cut-off of 1 × 10<sup>−50</sup>.
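The filtering criteria described above can be summarised in a few lines of code. The sketch below is only an illustration of the selection logic and does not reproduce the actual hmmsearch, SeqScrub, or EFI-EST workflows: the `Hit` record, its field names, and the function `select_hits` are hypothetical, and the E-value, domain, completeness, and length filters, which were applied at different stages of the analysis, are collapsed into a single pass for clarity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one database hit (not a real tool's output format)."""
    accession: str
    length: int                  # sequence length in amino acids
    evalue: float                # full-sequence E-value of the profile search
    domains: frozenset           # Pfam accessions annotated on the sequence
    is_fragment: bool = False    # True for incomplete sequences


def select_hits(hits, max_evalue=9.0e-30, domain="PF04230",
                min_length=250, max_length=600):
    """Keep complete sequences passing the E-value, domain, and length filters."""
    return [
        h for h in hits
        if h.evalue < max_evalue
        and domain in h.domains
        and not h.is_fragment
        and min_length <= h.length <= max_length
    ]
```

Keeping the thresholds as keyword arguments makes it easy to see how sensitive the size of a selection is to each criterion.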

#### 5.3.2. Results

Three independent database searches based on the biochemically characterized pyruvyltransferases CsaB, Pvg1p, and WcfO resulted in three sequence selections of 2053, 1019, and 233 sequences, respectively. The three selections did not share any protein sequences, implying that the sequence space covered by these searches does not overlap. It is, therefore, reasonable to assume that further pyruvyltransferase sequences, and organisms harbouring a pyruvyltransferase gene, exist that are not covered in this study. Additionally, the comparison shows that there are at least three different types of pyruvyltransferases that do not share a close sequence relationship. Judging from the number of sequences in the selections and the extent of their taxonomic distribution, pyruvyltransferases from the SSN of CsaB (CsaB-like) and from the SSN of Pvg1p (Pvg1p-like) seem to be the most common types, while WcfO-like pyruvyltransferases might represent a more specialized type.
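The comparison of the three selections amounts to a pairwise set intersection over accession identifiers. The following sketch shows the idea on toy data; only the three query accessions come from this review, whereas the `HYPOTHETICAL_*` entries, the selection labels used as dictionary keys, and the function name `pairwise_overlap` are placeholders introduced for illustration.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def pairwise_overlap(
    selections: Dict[str, Set[str]]
) -> Dict[Tuple[str, str], Set[str]]:
    """Return the accessions shared by each pair of named selections."""
    return {
        (a, b): selections[a] & selections[b]
        for a, b in combinations(sorted(selections), 2)
    }


# Toy selections; each is assumed to contain its own query accession.
selections = {
    "CsaB-like": {"K4ZGN3", "HYPOTHETICAL_A1", "HYPOTHETICAL_A2"},
    "Pvg1p-like": {"Q9UT27", "HYPOTHETICAL_B1"},
    "WcfO-like": {"Q5LFK7", "HYPOTHETICAL_C1"},
}

overlaps = pairwise_overlap(selections)
# True when no accession occurs in more than one selection.
disjoint = all(len(shared) == 0 for shared in overlaps.values())
```

With real data, each set would simply hold the accessions of one filtered selection.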

Looking at the taxonomic distribution of these types of pyruvyltransferases in the SSNs (Figure 6), it can be seen that CsaB-like pyruvyltransferases occurred mainly in the phyla of *Firmicutes* and *Cyanobacteria*; Pvg1p-like pyruvyltransferases occurred mainly in the phyla of *Proteobacteria* and *Firmicutes*; and WcfO-like pyruvyltransferases occurred almost exclusively in the phyla of *Bacteroidetes, Proteobacteria*, and *Firmicutes*. In most cases, these different phyla separated nicely into different clades. For the SSN of Pvg1p-like pyruvyltransferases, however, there were two clusters where *Proteobacteria* were heavily mixed with *Firmicutes* and *Bacteroidetes*, and in the SSN of WcfO-like pyruvyltransferases, *Proteobacteria* were found to be heavily mixed with *Bacteroidetes*. Such mixed sequence populations might occur because of the high rates of lateral gene transfer among *Proteobacteria* [173]. Looking across all three SSNs, there was typically only one major cluster for each phylum. The only exception was pyruvyltransferase sequences from *Firmicutes*, which showed multiple big clusters in all three SSNs, indicating that *Firmicutes* might carry multiple types of pyruvyltransferases.

**Figure 6.** Sequence similarity networks illustrating the extant sequence space around *Paenibacillus alvei* CsaB (K4ZGN3), *Schizosaccharomyces pombe* Pvg1p (Q9UT27), and *Bacteroides fragilis* WcfO (Q5LFK7). The three characterized sequences—CsaB, Pvg1p, and WcfO—are highlighted by yellow circles.
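How clusters arise in a sequence similarity network can be sketched in a few lines of Python: sequences are nodes, an edge is kept only if the pairwise similarity meets a chosen threshold, and clusters are the connected components of what remains. The sketch below uses a simple union-find structure on toy data. The score convention (higher means more similar), the node names, and the function name `ssn_clusters` are assumptions for illustration and do not reproduce the EFI-EST implementation.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def ssn_clusters(
    nodes: Sequence[str],
    scored_edges: Iterable[Tuple[str, str, float]],
    min_score: float,
) -> List[List[str]]:
    """Group nodes into connected components using edges with score >= min_score."""
    parent: Dict[str, str] = {n: n for n in nodes}

    def find(x: str) -> str:
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in scored_edges:
        if score >= min_score:
            parent[find(a)] = find(b)

    clusters: Dict[str, List[str]] = {}
    for n in nodes:
        clusters.setdefault(find(n), []).append(n)
    # Largest clusters first; ties are broken alphabetically.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: (-len(c), c))
```

Raising `min_score` removes weaker edges and splits the network into more, smaller clusters, which is why the chosen cut-off influences how cleanly phyla appear to separate.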

Analysing the functionally characterized pyruvyltransferase CsaB in the context of its surrounding sequence space showed the enzyme to be a typical representative of the biggest cluster (*Firmicutes*) in the CsaB-like SSN. The same goes for WcfO, which was also found within the biggest cluster (*Bacteroidetes* and *Proteobacteria*) of the WcfO-like SSN. Pvg1p, on the other hand, was found at the border of a minor *Ascomycota* clade in the Pvg1p-like network and, therefore, cannot be considered a typical representative of this network. It is interesting to note, however, that Pvg1p is a pyruvyltransferase from the fungal phylum of *Ascomycota*, but the sequence search based on Pvg1p resulted mainly in bacterial sequences from *Proteobacteria* and *Firmicutes*, rather than other fungal sequences.

In addition to CsaB, Pvg1p, and WcfO, this review further discussed 48 putative pyruvyltransferases, and about half of their corresponding amino acid sequences were present within the calculated SSNs. Possible reasons for this incomplete recovery of sequences in the SSNs were the lack of genome sequencing data, missing or faulty taxonomic annotation of sequences, and the fragmentary coverage of the pyruvyltransferase sequence space in the performed SSN analysis.

Note that this study refrained from removing sequences showing 100% sequence identity (possible duplicates), meaning that the utilized datasets included all currently known pyruvyltransferase entries found under the given search parameters on UniProtKB. This inevitably biases sequence counts towards organisms that are more heavily sequenced than others, but at the same time, it guarantees the representation of the full taxonomic distribution of pyruvyltransferases. From these datasets, *Firmicutes*, *Proteobacteria*, *Cyanobacteria*, and *Bacteroidetes* were found to be the phyla in which pyruvyltransferases are most common.
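To gauge how strongly retained duplicates inflate sequence counts, entries with identical sequences can be grouped, as in the following minimal sketch. It approximates "100% sequence identity" by exact, case-insensitive string equality over the full length, which is an assumption of this sketch; the function name `group_identical` and the toy records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def group_identical(records: Dict[str, str]) -> List[List[str]]:
    """Return groups of accessions that share exactly the same sequence."""
    by_sequence: Dict[str, List[str]] = defaultdict(list)
    for accession, sequence in records.items():
        by_sequence[sequence.upper()].append(accession)
    # Only groups with more than one member are possible duplicates.
    groups = [sorted(g) for g in by_sequence.values() if len(g) > 1]
    return sorted(groups)


# Toy records (accession -> amino acid sequence), not real data.
records = {
    "ENTRY_1": "MKTLLV",
    "ENTRY_2": "mktllv",
    "ENTRY_3": "MKTLLA",
}
duplicates = group_identical(records)
redundant_entries = sum(len(g) - 1 for g in duplicates)
```

Here, `redundant_entries` is the number of entries that would disappear if each group were collapsed to a single representative.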

From this study, it is evident that the pyruvyltransferase sequences available in public databases are extremely diverse, and without the availability of further biochemically characterized pyruvyltransferases, predictions of pyruvyltransferases based on amino acid sequences have to be interpreted with care.

#### **6. Discussion**

Pyruvyltransferases are a widespread but little investigated class of carbohydrate-active enzymes, which transfer a pyruvate moiety from a PEP donor to various monosaccharide targets (Table 1). This leads to a wealth of glycoconjugates carrying this modification. Pyruvylation can be found in almost all classes of glycoconjugates (including EPS, CPS, CA, LPS, LOS, SCWP, and *N*-glycans) occurring in bacteria, algae, and yeast, but not in humans. Importantly, pyruvylation imparts an anionic character to the glycoconjugates, which is pivotal to many biological functions. Described functions include the influence on the viscosity of the EPS, bacterial symbiosis with plants [18,28,46], immunostimulatory effects (mostly of CPSs [7,93]), employment of sialylation-like properties in human-type oligosaccharides [198], and cell wall anchoring relying on the Pyr-β-D-ManNAc epitope [14,99,104,195], to name a few. However, learning more about the biological significance of pyruvylated glycoconjugates and delineating a possible association between the position of pyruvylation and functionality remain challenges for future research.

Regrettably, for most of the described pyruvylated glycoconjugates, the genetic determinants of the modification are unknown because genome sequencing data of the respective organisms are missing. Given the widespread occurrence and the importance of sugar pyruvylation in nature, there is considerable interest in the research community in identifying pyruvyltransferases and gaining insight into the mechanism of pyruvylation, especially with regard to the high potential to reveal novel functions and drug targets.





Up until now, three orthologous pyruvyltransferases have been biochemically investigated [7,111,161]. However, they do not show any close sequence relationship (see Figure 6). This finding might point towards convergent evolution of pyruvyltransferases or a very high evolutionary rate underlying the high sequence variability present in this enzyme class. The SSNs established within the frame of this review indicate that the described sequence space around the three hitherto characterized sequences was not sufficient to cover the whole extent of sequence variation of pyruvyltransferases. Based on the currently available sequence information, pyruvyltransferases mainly occur in the bacterial phyla *Firmicutes*, *Proteobacteria*, *Cyanobacteria*, and *Bacteroidetes*, and to a lesser extent in eukaryotic species.

Given the relentless spread of antibiotic-resistant organisms, new chemotherapeutic strategies to overcome infections could be based on intervening in the mechanisms of pyruvylation, an enzymatic modification detected in almost all classes of cell envelope glycoconjugates.

**Author Contributions:** F.F.H. and C.S. (Cordula Stefanović) wrote the initial manuscript; L.S. conducted and edited the sequence space part (SSNs) of the manuscript; M.B. contributed to the pyruvate analytics part of the manuscript; C.S. (Christina Schäffer) wrote, revised, and edited the manuscript. All authors approved the final version of the manuscript.

**Funding:** This research was funded by the Austrian Science Fund FWF, projects P27374-B22 and P31521-B22 (to C.Sch.); the Hochschuljubiläumsstiftung der Stadt Wien, project H-318348/2018 (to F.F.H.), and the PhD Program "Biomolecular Technology of Proteins", FWF project W1224.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
