*2.4. Nomenclature*

To facilitate data interpretation, the following abbreviations were adopted. A long-chain fatty acid, such as the stearic acid attached to the C2 position of the trehalose backbone, is designated as 18:0. A multiple methyl-branched mycolipenic acid, for example the 2,4,6-trimethyl-2-tetracosenoic acid attached to the C3 position, is designated as 27:1 to reflect the fact that the acid contains a C27 acyl chain with one double bond. Therefore, the DAT species consisting of 18:0- and 27:1-FA substituents at the C2 and C3 positions, respectively, is designated as 18:0/27:1-DAT.
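The shorthand can be read mechanically. The following is a minimal sketch, not part of the original method: the names `FattyAcid`, `parse_dat`, and `nominal_mass` are hypothetical, and the mass formula assumes an unfunctionalized free acid CnH(2n−2d)O2 with nominal atomic masses (C = 12, H = 1, O = 16).

```python
import re
from typing import NamedTuple, Tuple


class FattyAcid(NamedTuple):
    carbons: int        # total carbons, methyl branches included
    double_bonds: int   # number of C=C double bonds


def parse_dat(label: str) -> Tuple[FattyAcid, FattyAcid]:
    """Split a label such as '18:0/27:1-DAT' into its (C2, C3) acyl groups."""
    match = re.fullmatch(r"(\d+):(\d+)/(\d+):(\d+)-DAT", label.strip())
    if match is None:
        raise ValueError(f"not a DAT label: {label!r}")
    c2_c, c2_d, c3_c, c3_d = (int(g) for g in match.groups())
    return FattyAcid(c2_c, c2_d), FattyAcid(c3_c, c3_d)


def nominal_mass(fa: FattyAcid) -> int:
    """Nominal mass of the free acid CnH(2n-2d)O2 (assumed composition)."""
    return 14 * fa.carbons + 32 - 2 * fa.double_bonds


c2, c3 = parse_dat("18:0/27:1-DAT")
print(c2, nominal_mass(c2))   # substituent at C2
print(c3, nominal_mass(c3))   # substituent at C3
```

Under these assumptions the two acids come out at 284 and 408 Da, the neutral losses quoted in Section 3.1.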

### **3. Results and Discussion**

DAT formed [M + Alk]+ ions (Alk = NH4, Li, Na, etc.) in the positive-ion mode and [M + X]− ions (X = Cl, RCO2; R = H, CH3, C2H5, etc.) in the negative-ion mode when subjected to ESI in the presence of Alk+ and X<sup>−</sup>. For example, when DAT was dissolved in CH3OH in the presence of HCO2Na, [M + Na]+ adduct ions were observed in the positive-ion mode (Figure 1) and [M + HCO2]− ions in the negative-ion mode (data not shown). The formation of these adduct ions was confirmed by the elemental compositions of the molecular species deduced by high-resolution mass spectrometry (Table 1). When subjected to CID in a linear ion trap, both the [M + Na]+ and [M + HCO2]− ions gave MSn (n = 2, 3, 4) spectra that contain rich structural information readily applicable to structural identification.
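As a rough illustration of the adduct arithmetic, the sketch below computes nominal m/z values for singly charged adducts of a diacyltrehalose. It is only a sketch under stated assumptions: the function names are hypothetical, all masses are nominal (integer) values rather than the accurate masses of Table 1, and Cl is taken as the 35 isotope.

```python
TREHALOSE = 342   # nominal mass of trehalose, C12H22O11
WATER = 18        # one H2O is lost for each ester bond formed

# Nominal masses of some of the adducting species named in the text.
CATIONS = {"NH4": 18, "Li": 7, "Na": 23}
ANIONS = {"Cl": 35, "HCO2": 45, "CH3CO2": 59}


def dat_nominal_mass(fa_masses):
    """Neutral DAT: trehalose esterified with the given free fatty acids."""
    return TREHALOSE + sum(fa_masses) - WATER * len(fa_masses)


def adduct_mz(neutral_mass, adduct):
    """m/z of a singly charged [M + adduct] ion (positive or negative)."""
    if adduct in CATIONS:
        return neutral_mass + CATIONS[adduct]
    if adduct in ANIONS:
        return neutral_mass + ANIONS[adduct]
    raise KeyError(f"unknown adduct: {adduct}")


m = dat_nominal_mass([284, 408])      # 18:0/27:1-DAT
print(m, adduct_mz(m, "Na"), adduct_mz(m, "HCO2"))
```

For 18:0/27:1-DAT this gives the m/z 1021 ([M + Na]+) and m/z 1043 ([M + HCO2]−) ions discussed in Sections 3.1 and 3.3.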


**Table 1.** The high-resolution mass measurements of the [M + Na]+ ions of DATs isolated from *M. tuberculosis* and the assigned structures (\* structure not defined).

**Figure 1.** The positive-ion ESI mass spectrum of the [M + Na]+ ions of the DAT species isolated from the cell envelope of *M. tuberculosis*.

#### *3.1. The Fragmentation Processes of the [M + Na]+ Ions of DAT Revealed by LIT MSn*

The ability of LIT MSn mass spectrometry to perform sequential precursor-ion isolation, CID, and acquisition of MSn spectra affords insight into not only the fragmentation processes but also the structural details of the molecules. For example, the LIT MS<sup>2</sup> spectrum of the [M + Na]+ ions of 18:0/27:1-DAT at m/z 1021 (Figure 2a) was dominated by the ions of m/z 859 arising from loss of a glucose residue, along with the ions at m/z 737 and 613, arising from losses of the 18:0- and 27:1-fatty acid substituents, respectively (Scheme 1). The ions of m/z 859 represent the sodiated diacylglucose bearing both the 18:0- and 27:1-fatty acyl substituents. This notion is further supported by the MS<sup>3</sup> spectrum of the ions of m/z 859 (1021 → 859, Figure 2b), which contained ions of m/z 575 (859 – 284) and 451 (859 – 408), arising from losses of the 18:0- and 27:1-fatty acid substituents, respectively. The results also suggest that the Na+ charge site most likely resides on the glucose ring bearing the two acyl groups (Glc 1). In contrast, the MS<sup>2</sup> spectrum of the [M + Na]+ ions of the 6,6-dioleoyltrehalose standard [30] at m/z 893 contained abundant ions at m/z 467, representing the sodiated oleoylglucose (data not shown), consistent with the fact that the 18:1 fatty acyl substituents in the 18:1/18:1-DAT are located on separate glucose rings (i.e., Glc 1 and Glc 2). Further dissociation of the ions of m/z 737 (1021 → 737, Figure 2c) gave rise to the prominent ions of m/z 329 by loss of the 27:1-fatty acid substituent and of m/z 575 by loss of the glucose residue (Glc 2), together with ions of m/z 431 representing the sodiated 27:1-FA. These results further support the fragmentation processes proposed in Scheme 1.
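The neutral-loss bookkeeping in this paragraph can be sketched in a few lines. This is an illustrative calculation only, using nominal masses; the function name and dictionary keys are hypothetical, and the glucose residue is taken as 162 Da (the difference between m/z 1021 and 859).

```python
GLC_RESIDUE = 162   # glucose residue lost from the precursor (1021 - 859)
SODIUM = 23


def sodiated_fragments(precursor_mz, fa_c2, fa_c3):
    """Nominal m/z of the [M + Na]+ fragments described for a C2/C3 DAT."""
    diacyl_glc = precursor_mz - GLC_RESIDUE
    return {
        "loss of Glc 2 (sodiated diacylglucose)": diacyl_glc,
        "loss of C2 FA": precursor_mz - fa_c2,
        "loss of C3 FA": precursor_mz - fa_c3,
        "Glc 2 and C2 FA lost": diacyl_glc - fa_c2,
        "Glc 2 and C3 FA lost": diacyl_glc - fa_c3,
        "C2 and C3 FA lost": precursor_mz - fa_c2 - fa_c3,
        "sodiated C3 FA": fa_c3 + SODIUM,
    }


for name, mz in sodiated_fragments(1021, 284, 408).items():
    print(f"{name}: m/z {mz}")
```

For 18:0/27:1-DAT the sketch returns m/z 859, 737, 613, 575, 451, 329, and 431, matching the ions cited from Figure 2a–c.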

**Figure 2.** The LIT MS<sup>2</sup> spectrum of the [M + Na]+ ion of 18:0/27:1-DAT at m/z 1021 (**a**), its MS<sup>3</sup> spectra of the ions of m/z 859 (1021 → 859) (**b**), and of m/z 737 (1021 → 737) (**c**); and its MS<sup>4</sup> spectra of the ions of m/z 575 (1021 → 859 → 575) (**d**), and of m/z 451 (1021 → 859 → 451) (**e**).

**Scheme 1.** The structure of [M + Na]+ ion of 18:0/27:1-DAT at m/z 1021 and proposed LIT MSn fragmentation processes\*. \* All the ions represent the sodiated species. To simplify, the drawing of "Na+" is omitted from the scheme.

The formation of the ions of m/z 575 from m/z 859 by loss of the 18:0-FA residue at C2 may involve the participation of the hydrogen atom at C1 to form an enol, which undergoes keto-enol tautomerism to yield a stable sodiated monoacyl (27:1) glucose ion in the keto form (Scheme 1). These fragmentation processes are further supported by MS<sup>4</sup> on the ions of m/z 575 (1021 → 859 → 575, Figure 2d), which yielded ions of m/z 545, 515, and 475, likely arising from cross-ring cleavages of the glucose ring, suggesting that the 27:1-fatty acyl substituent is located at C3 (Scheme 1).

Similarly, MS<sup>4</sup> on the ions of m/z 451 (1021 → 859 → 451, Figure 2e) gave rise to ions of m/z 421, 391, and 361 arising from similar cleavages of the glucose ring, indicating that the 18:0-fatty acyl substituent is most likely located at C2 of the glucose ring. The initial loss of the 27:1-FA substituent may involve the participation of the adjacent hydrogen at C4 of Glc 1 to form an enol, which subsequently rearranges to the keto form via a similar keto-enol tautomerism mechanism.

The preferential formation of the ions of m/z 575 from loss of the 18:0-FA substituent over the ions of m/z 451 from the analogous loss of the 27:1-FA, as seen in Figure 2a, can readily be used to locate the FA substituents on the trehalose backbone.

#### *3.2. LIT MSn on the [M + Na]+ Ions of DAT for Stereoisomer Recognition*

The use of LIT MSn to define the structures of DAT species comprising many isomers is exemplified by the characterization of the [M + Na]+ ions of m/z 979, which gave rise to the prominent ions at m/z 817 (Figure 3a) arising from loss of glucose. Further dissociation of the ions of m/z 817 (979 → 817; Figure 3b) yielded the ion pair of m/z 533/451, arising from losses of the 18:0- and 24:1-fatty acid substituents, respectively, indicating that these two acyl groups are situated on Glc 1 and giving the assignment of the 18:0/24:1-DAT structure. The spectrum also contained the m/z 535/449, 561/423, 547/437, 563/421, and 575/409 ion pairs, arising from losses of the 18:1/24:0, 16:0/26:1, 17:0/25:1, 16:1/26:0, and 15:0/27:1 FA pairs, respectively. The results indicate the presence of the 18:1/24:0-, 16:0/26:1-, 17:0/25:1-, 16:1/26:0-, and 15:0/27:1-DAT isomers. The above structure assignments were further confirmed by the MS<sup>3</sup> and MS<sup>4</sup> spectra. For example, the MS<sup>3</sup> spectrum of the ions of m/z 697 (979 → 697; Figure 3c), which arise from primary loss of the 18:1-FA residue at C2 (Figure 3a), gave the abundant ions of m/z 535 (loss of Glc 2), along with ions of m/z 329 arising from loss of the 24:0-FA substituent and ions of m/z 391 representing the sodiated 24:0-FA. The results confirm the presence of 18:1/24:0-DAT. The MS<sup>4</sup> spectrum of the ions of m/z 535 (979 → 817 → 535; Figure 3d) contained ions of m/z 517 (loss of H2O) and of m/z 505, 475, and 445 arising from cleavages of the sugar ring similar to those shown in Scheme 1, along with ions of m/z 167 from loss of the 24:0-FA, pointing to the notion that the 24:0-FA is located at C3.

The MS<sup>3</sup> spectrum of the ions of m/z 695 (979 → 695; Figure 3e) contained the ions of m/z 533 and 329 arising from further losses of the Glc and 24:1-FA residues, respectively (Figure 3a). The MS<sup>4</sup> spectrum of the ions of m/z 533 (979 → 817 → 533; data not shown) gave ions of m/z 503, 473, and 443 from similar fragmentation processes that cleave the sugar ring (Scheme 1), confirming that the 24:1-FA substituent is located at C3. These results led to the assignment of the 18:0/24:1-DAT structure. Using this LIT MSn approach, a total of six isomeric structures were identified.
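The isomer assignment from the MS<sup>3</sup> ion pairs of m/z 817 amounts to converting each neutral loss back into a fatty acid label. The sketch below shows one way to do this with nominal masses; `acid_label` and `assign_isomers` are hypothetical helper names, the acids are assumed to be unfunctionalized (CnH(2n−2d)O2), and each pair is assumed to be listed as (ion from loss of the C2 acid, ion from loss of the C3 acid).

```python
def acid_label(acid_mass, max_double_bonds=2):
    """Map a free-acid nominal mass to 'carbons:double_bonds'."""
    for d in range(max_double_bonds + 1):
        n, remainder = divmod(acid_mass - 32 + 2 * d, 14)
        if remainder == 0:
            return f"{n}:{d}"
    raise ValueError(f"no simple fatty acid matches mass {acid_mass}")


def assign_isomers(diacylglucose_mz, ion_pairs):
    """Turn (C2-loss ion, C3-loss ion) pairs into 'C2/C3-DAT' labels."""
    isomers = []
    for c2_loss_ion, c3_loss_ion in ion_pairs:
        c2 = acid_label(diacylglucose_mz - c2_loss_ion)
        c3 = acid_label(diacylglucose_mz - c3_loss_ion)
        isomers.append(f"{c2}/{c3}-DAT")
    return isomers


pairs = [(533, 451), (535, 449), (561, 423), (547, 437), (563, 421), (575, 409)]
for label in assign_isomers(817, pairs):
    print(label)
```

Applied to the six ion pairs quoted above, the sketch reproduces the six isomers named in the text, from 18:0/24:1-DAT through 15:0/27:1-DAT.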

#### *3.3. The Fragmentation Processes of the [M + HCO2]− Ions of DAT Revealed by LIT MSn*

In the negative-ion mode in the presence of HCO2−, 18:0/27:1-DAT formed [M + HCO2]− ions of m/z 1043, which gave rise to the prominent ions of m/z 997 by loss of HCO2H, along with ions of m/z 713 and 589 by further losses of the 18:0- and 27:1-FA substituents, respectively (Figure 4a; Scheme 2). This fragmentation process is further supported by the MS<sup>3</sup> spectrum of the ions of m/z 997 (1043 → 997, Figure 4b), which are equivalent to the [M − H]− ions of 18:0/27:1-DAT. The ions of m/z 731 (Figure 4a) arising from loss of the 18:0-FA as a ketene are more prominent than the ions of m/z 607 arising from the analogous 27:1-ketene loss. This preferential formation of the ions of m/z 731, corresponding to loss of the FA-ketene at C2, over the ions of m/z 607 from the FA-ketene loss at C3 was seen in all the MS<sup>2</sup> spectra of the [M + HCO2]− ions of DAT, providing useful information for distinguishing the FA substituents at C2 and C3. The ketene loss process probably involves the participation of HCO2−, which abstracts the labile α-hydrogen of the fatty acyl group to eliminate the FA-ketene and HCO2H simultaneously (Scheme 2). Therefore, the low abundance of the ions of m/z 607 arising from loss of the FA-ketene at C3 may reflect the fact that the 27:1-FA substituent at C3 contains an α-methyl side chain [19,21,22,31] and does not contain the labile α-hydrogen required for ketene loss. This is in contrast to the 18:0-FA substituent at C2, which possesses two α-hydrogens (Scheme 2).
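The two competing loss channels described above differ by one water (a ketene is the free acid minus H2O). A minimal sketch of the arithmetic, with nominal masses and hypothetical names, follows; it only predicts where the ions would appear and says nothing about their relative abundances.

```python
FORMIC_ACID = 46   # HCO2H
WATER = 18         # a ketene is the free acid minus H2O


def formate_adduct_fragments(precursor_mz, fa_c2, fa_c3):
    """Nominal m/z of the [M + HCO2]- fragments described in the text."""
    deprotonated = precursor_mz - FORMIC_ACID          # [M - H]-
    return {
        "[M - H]-": deprotonated,
        "HCO2H + C2 FA lost": deprotonated - fa_c2,
        "HCO2H + C3 FA lost": deprotonated - fa_c3,
        "HCO2H + C2 ketene lost": deprotonated - (fa_c2 - WATER),
        "HCO2H + C3 ketene lost": deprotonated - (fa_c3 - WATER),
    }


for name, mz in formate_adduct_fragments(1043, 284, 408).items():
    print(f"{name}: m/z {mz}")
```

For the m/z 1043 precursor this gives m/z 997, 713, 589, 731, and 607, the ions discussed for Figure 4a.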

**Figure 3.** The MS<sup>2</sup> spectrum of the [M + Na]+ ions of m/z 979 (**a**); its MS<sup>3</sup> spectra of the ions of m/z 817 (979 → 817) (**b**) and of m/z 697 (979 → 697) (**c**); its MS<sup>4</sup> spectrum of the ions of m/z 535 (979 → 817 → 535) (**d**); and the MS<sup>3</sup> spectrum of the ions of m/z 695 (979 → 695) (**e**).

**Figure 4.** The negative-ion MS<sup>2</sup> spectrum of the [M + HCO2]− ions of 18:0/27:1-DAT at m/z 1043 (**a**) and its MS<sup>3</sup> spectrum of the ions of m/z 997 (1043 → 997) (**b**).

**Scheme 2.** The proposed LIT MSn fragmentation processes of the [M + HCO2]− ions of 18:0/27:1-DAT at m/z 1043.

The ions at m/z 731 and 607 arising from losses of the 18:0-ketene and 27:1-ketene, respectively, are absent in Figure 4b. This is consistent with the notion that the ketene loss requires the participation of HCO2<sup>−</sup>. The ketene loss pathway is no longer operative once the [M − H]− ions have been formed from the [M + HCO2]− ions by loss of HCO2H.

The spectrum (Figure 4b) also contained the prominent ions of m/z 407, representing the 27:1-FA carboxylate anions, and the ions of m/z 283, representing the 18:0-FA carboxylate anions, along with ions of m/z 305 arising from losses of both the 18:0- and 27:1-FA substituents. The preferential formation of the ions of m/z 407 (from C3) over those of m/z 283 (from C2) also reflects the location of the fatty acid substituents on the Glc ring, leading to the assignment of the 18:0/27:1-DAT structure.

#### *3.4. Recognition of Stereoisomers Applying LIT MSn on the [M + HCO2]− Ions*

Similarly, the MS<sup>2</sup> spectrum of the [M + HCO2]− ions of m/z 1001 is dominated by the ions of m/z 955 from loss of HCO2H (Figure 5a). The MS<sup>3</sup> spectrum of the ions of m/z 955 (1001 → 955) (Figure 5b) contained ions at m/z 701, 699, 685, 673, 671, 589, 587, 575, 561, and 559, similar to those seen in Figure 5a, consistent with the consecutive dissociation processes of the [M − H]− ions that eliminate the FA substituents. These ions arose from losses of the 16:1-, 16:0-, 17:0-, 18:1-, 18:0-, 24:1-, 24:0-, 25:1-, 26:1-, and 26:0-fatty acid substituents, respectively. The spectrum also contained the major m/z 283/365 and 281/367 ion pairs, together with the minor ion pairs of m/z 255/393, 253/395, 269/379, and 241/407. This structural information led to the assignment of the major 18:0/24:1- and 18:1/24:0-DAT isomers, together with the minor 16:0/26:1-, 16:1/26:0-, 17:0/25:1-, and 15:0/27:1-DAT isomers. It should be noted that the absence of the ions at m/z 241, 253, 255, and 269, representing the 15:0-, 16:1-, 16:0-, and 17:0-carboxylate anions, respectively, in Figure 5b is attributable to the low-mass cutoff of an ion-trap instrument. In contrast, these ions are abundant in the HCD product-ion spectrum (data not shown), similar to that obtained with a triple quadrupole instrument [32]. The location of the fatty acid substituents on the glucose skeleton is again confirmed by the observation of the ions corresponding to losses of the fatty acid substituents as ketenes. For example, the ions at m/z 691 and 689 (Figure 5a), derived from losses of (HCO2H + 18:1-ketene) and (HCO2H + 18:0-ketene), respectively, point to the notion that the 18:1- and 18:0-FA are situated at C2 of the 18:1/24:0- and 18:0/24:1-DAT isomers, respectively. The assigned structure of 18:0/24:1-DAT, for example, is further confirmed by MS<sup>4</sup> on the ions of m/z 671 (1001 → 955 → 671) (Figure 5c), which formed ions of m/z 365, representing the 24:1-carboxylate anion, and ions of m/z 323 and 305 arising from further loss of the 24:1-FA substituent as a ketene and as a free acid, respectively.
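One way to screen carboxylate anion pairs against a given [M − H]− precursor is a simple sum rule. Because the ion remaining after both acids are lost as free acids appears at m/z 305, the [M − H]− ion equals 305 plus the two acid masses, and each carboxylate anion is its acid minus 1; a genuine pair must therefore sum to [M − H]− − 307. The sketch below applies this check with nominal masses. The helper names are hypothetical, and each pair is assumed to be listed as (C2 anion, C3 anion), following the order used in the text.

```python
CORE_ANION = 305   # m/z left after both acyl groups leave as free acids


def acid_label(acid_mass, max_double_bonds=2):
    """Map a free-acid nominal mass to 'carbons:double_bonds' (CnH(2n-2d)O2)."""
    for d in range(max_double_bonds + 1):
        n, remainder = divmod(acid_mass - 32 + 2 * d, 14)
        if remainder == 0:
            return f"{n}:{d}"
    raise ValueError(f"no simple fatty acid matches mass {acid_mass}")


def assign_from_carboxylates(deprotonated_mz, anion_pairs):
    """Keep only pairs consistent with the precursor and label them."""
    expected_sum = deprotonated_mz - CORE_ANION - 2
    isomers = []
    for c2_anion, c3_anion in anion_pairs:
        if c2_anion + c3_anion != expected_sum:
            continue  # the two anions cannot come from the same precursor
        isomers.append(
            f"{acid_label(c2_anion + 1)}/{acid_label(c3_anion + 1)}-DAT"
        )
    return isomers


pairs = [(283, 365), (281, 367), (255, 393), (253, 395), (269, 379), (241, 407)]
print(assign_from_carboxylates(955, pairs))
```

All six pairs quoted for the m/z 955 precursor satisfy the rule (each sums to 648), whereas a mismatched pair such as 283/367 would be rejected.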

**Figure 5.** The negative-ion MS<sup>2</sup> spectrum of the [M + HCO2]− ions of m/z 1001 (**a**), its MS<sup>3</sup> spectrum of the ions of m/z 955 (1001 → 955) (**b**), and the MS<sup>4</sup> spectrum of the ions of m/z 671 (1001 → 955 → 671) (**c**).

#### *3.5. Characterization of Minor Species Applying LIT MSn on the [M + HCO2]− Ions*

Applying multiple-stage mass spectrometry (LIT MSn), in which consecutive ion isolation is followed by CID, is particularly useful for the characterization of minor DAT species as [M + HCO2]− ions. For example, the MS<sup>2</sup> spectrum of the [M + HCO2]− ions of the minor DAT at m/z 989 (Figure 6a) gave major [M − H]− fragment ions at m/z 943, but the spectrum also contained many unrelated fragment ions (e.g., ions of m/z 957, 930, 921, and 905) that complicate the structural identification. These fragment ions may arise from adjacent precursor ions admitted together with the desired DAT ions for CID, because the precursor-ion selection window (1 Da) cannot sufficiently isolate the isobaric ions (the mass selection window and injection time govern the total ions admitted to the trap for CID, and a mass selection window of >1 Da is often required to maintain sensitivity). Thus, fragment ions unrelated to the targeted molecule were formed simultaneously, complicating the structural analysis. However, the MS<sup>3</sup> spectrum of the ions of m/z 943 (Figure 6b) contained only fragment ions related to the DAT species, because the [M − H]− ions retain the complete structure but have been further isolated, and the fragment ions unrelated to the structure have been filtered out by the additional (MS<sup>3</sup>) isolation stage. In this context, the MS<sup>4</sup> spectrum of the ions of m/z 589 (989 → 943 → 589; Figure 6c), which were further "purified", is even more specific, because only the fragment ions from the DAT bearing a 23:0-FA substituent at C3 were subjected to further CID. Thus, the spectrum contained only ions of m/z 283, representing the 18:0-carboxylate anion, along with ions at m/z 323 and 305, representing the dehydrated trehalose anions. These results led to the specific assignment of the 18:0/23:0-DAT structure. The spectrum in Figure 6b also contained the m/z 269/367 and 255/381 ion pairs, representing the 17:0/24:0- and 16:0/25:0-FA carboxylate anion pairs, together with the m/z 673/575 and 687/561 ion pairs, arising from losses of the 17:0/24:0- and 16:0/25:0-FA substituents, respectively. These results readily led to the assignment of the 17:0/24:0- and 16:0/25:0-DAT isomeric structures.
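The carboxylate anion pairs and the FA-loss ion pairs in the MS<sup>3</sup> spectrum of m/z 943 carry the same information and can be cross-checked against each other: an acid that appears as a carboxylate anion at m/z x has a nominal mass of x + 1, so its loss from the [M − H]− ion should give an ion at 943 − (x + 1). A minimal sketch of this check, with hypothetical function names and nominal masses, is given below.

```python
def fa_loss_ions(deprotonated_mz, carboxylate_pair):
    """Predict the FA-loss ions that accompany a carboxylate anion pair."""
    # carboxylate m/z = free-acid mass - 1, so the acid mass is m/z + 1
    return tuple(deprotonated_mz - (mz + 1) for mz in carboxylate_pair)


def pair_is_supported(deprotonated_mz, carboxylate_pair, observed_ions):
    """True if both predicted FA-loss ions are among the observed ions."""
    predicted = fa_loss_ions(deprotonated_mz, carboxylate_pair)
    return all(mz in observed_ions for mz in predicted)


observed = {687, 673, 575, 561}          # FA-loss ions quoted for Figure 6b
for pair in [(269, 367), (255, 381)]:    # 17:0/24:0 and 16:0/25:0 anions
    print(pair, fa_loss_ions(943, pair), pair_is_supported(943, pair, observed))
```

With these inputs the 269/367 anion pair maps onto the m/z 673/575 loss ions and the 255/381 pair onto m/z 687/561.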

**Figure 6.** The negative-ion MS<sup>2</sup> spectrum of the [M + HCO2]− ions of DAT at m/z 989 (**a**), its MS<sup>3</sup> spectrum of the ions of m/z 943 (989 → 943) (**b**), and its MS<sup>4</sup> spectrum of the ions of m/z 589 (989 → 943 → 589) (**c**). The ions of m/z 989 are minor species, and their MS<sup>2</sup> spectrum contains several ions (shown in red in (**a**)) unrelated to the structure; these ions are filtered out at the higher MSn stages, as shown in (**b**) (MS<sup>3</sup>) and (**c**) (MS<sup>4</sup>).
