Isolated Compounds Structure Determination Using NMR and MS

Fourteen compounds were isolated and identified using different spectroscopic techniques: (**1**) apigenin 6-*C*-(2-deoxy-*β*-D-galactopyranoside)-7-*O*-*β*-D-quinovopyranoside, (**2**) luteolin 6-*C*-*α*-L-rhamnopyranosyl-(1→2)-*β*-D-fucopyranoside, (**3**) apigenin 6-*C*-*β*-D-galactopyranoside, (**4**) apigenin 6-*C*-*α*-L-rhamnopyranosyl-(1→2)-*β*-L-fucopyranoside, (**5**) carambolaside M, (**6**) carambolaside Ia, (**7**) carambolaside J, (**8**) phloretin 3′-*C*-(2-*O*-(*E*)-cinnamoyl-3-*O*-*β*-D-fucopyranosyl-4-*O*-acetyl)-*β*-D-fucopyranosyl-6′-*O*-*β*-D-fucopyranosyl-(1→2)-*α*-L-arabinofuranoside, (**9**) carambolaside I, (**10a**) carambolaside P, (**10b**) carambolaside O, (**11a**) phloretin 3′-*C*-(2-*O*-(*E*)-*p*-coumaroyl-3-*O*-*β*-D-fucosyl-4-*O*-acetyl)-*β*-D-fucosyl-6′-*O*-(2-*O*-*β*-D-fucosyl)-*α*-L-arabinofuranoside, (**11b**) phloretin 3′-*C*-(2-*O*-(*Z*)-*p*-coumaroyl-3-*O*-*β*-D-fucosyl-4-*O*-acetyl)-*β*-D-fucosyl-6′-*O*-(2-*O*-*β*-D-fucosyl)-*α*-L-arabinofuranoside, and (**12**) carambolaside Q.

The compounds identified here for the first time in nature (**1**, **8**, **11a**, and **11b**) are discussed in detail in this section. All spectral data are provided in the supplementary file. Compound **1** (Figure 2) was isolated as a yellowish amorphous powder soluble in 100% MeOH. The molecular formula of compound **1** was established as C27H30O13, based on a deprotonated ion peak detected at *m*/*z* 561.1614 [M-H]<sup>−</sup> (calculated for C27H29O13<sup>−</sup>, *m*/*z* 561.16137; error −0.1 ppm) in the HR-ESI-MS spectrum (Figure S1). Compound **1** showed two UV maxima (λmax, MeOH) at 270 nm (Band II) and 334 nm (Band I), characteristic of a flavone skeleton [27]. The IR spectrum of compound **1** showed a broad band at 3431.4 cm<sup>−1</sup> and a band at 1623 cm<sup>−1</sup>, consistent with the presence of hydroxy and carbonyl functions, respectively [28].
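The calculated *m*/*z* values and ppm errors quoted for the [M-H]<sup>−</sup> ions can be checked with a short script. The following is a minimal sketch, not the authors' procedure: the function names are hypothetical, the monoisotopic masses are standard reference values, and the sign convention (calculated minus detected) is an assumption inferred from the signs reported in the text.

```python
# Monoisotopic masses in unified atomic mass units (standard reference values).
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
ELECTRON_MASS = 0.00054857990946


def anion_mz(composition):
    """m/z of a singly charged anion given its elemental composition,
    e.g. {"C": 27, "H": 29, "O": 13} for [M-H]- (one H already removed)."""
    atoms = sum(MONOISOTOPIC_MASS[el] * n for el, n in composition.items())
    return atoms + ELECTRON_MASS  # the anion carries one extra electron


def ppm_error(calculated, detected):
    """Mass error in ppm. The (calculated - detected) sign convention is an
    assumption chosen to match the signs quoted in the text."""
    return (calculated - detected) / calculated * 1e6


# Compound 1: [M-H]- = C27H29O13-, detected at m/z 561.1614.
calculated = anion_mz({"C": 27, "H": 29, "O": 13})
print(f"calculated m/z: {calculated:.4f}")
print(f"error: {ppm_error(561.16137, 561.1614):.1f} ppm")
```

With these reference masses the computed value for compound **1** agrees with the quoted 561.16137 to within about 1 × 10<sup>−5</sup>; the small residual presumably reflects rounding or a slightly different mass table.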

The full assignment of the 1H and 13C NMR data (Figures S2 and S3, Table 2) was based on analysis of the 1H-1H COSY, HSQC, and HMBC spectra (Figures S4–S6). The flavone unit was readily recognized in the 1H and 13C NMR spectra (Figures S2 and S3, Table 2) from the key signals of four A2B2-type aromatic protons at δ 7.80 (2H, d, *J* = 8.8 Hz, H-2′/6′) and δ 6.77 (2H, d, *J* = 8.8 Hz, H-3′/5′) for a *p*-disubstituted benzene ring, together with two aromatic singlets at δ 7.02 (H-8) and δ 6.58 (H-3), indicating a 6,7-disubstituted apigenin [29]. Moreover, the 13C NMR spectrum (Figure S3) revealed 27 carbon resonances, as expected for a di-glycosylated apigenin: the signal at δ 184.0 (C-4) was assignable to a ketonic carbonyl, and the resonances at δ 102.4 (C-3), 159.9 (C-5), 113.6 (C-6), 164.5 (C-7), and 96.4 (C-8) were similar to those reported for 6,7-disubstituted apigenin [29]. Excluding the 15 carbons of the flavone unit, the remaining 12 carbons were assigned to two deoxyhexopyranosyl units. The chemical shift (δ) and coupling constant (*J*) values, as well as the 1H-1H COSY spectrum (Figure S4), identified the first hexose moiety as 2-deoxy-β-D-galactose. The signals for an anomeric proton at δ 5.10 (dd, *J* = 12.1, 2.4 Hz, H-1), two protons at δ 2.83 (q, *J* = 12.1 Hz, H1-2) and δ 1.59 (m, H2-2), two protons at δ 4.02 (dd, *J* = 12.1, 2.1 Hz, H1-6) and δ 3.74 (dd, *J* = 12.1, 6.4 Hz, H2-6), and six carbons at δ 70.5 (C-1), 32.3 (C-2), 71.6 (C-3), 78.7 (C-4), 76.1 (C-5), and 62.8 (C-6) are consistent with those of a 2-deoxy-β-D-galactose [30] attached at the C-6 position of apigenin via a *C*-glycosidic linkage, as confirmed by HMBC correlations (Figure 3). The second sugar was assigned as β-quinovopyranose attached to C-7 via an oxygen bridge, based on its anomeric proton and carbon at δ 4.92 (1H, d, *J* = 7.7 Hz, H-1) and δ 103.8 (C-1).
Further, methyl protons at δ 1.26 (3H, d, *J* = 6.5 Hz, H-6), four oxymethine carbons at δ 75.0 (C-2), 77.1 (C-3), 71.8 (C-4), and 72.1 (C-5), and a methyl carbon at δ 17.9 (C-6) confirmed the sugar constitution. This sugar unit was identified from its large axial–axial coupling constants, which reveal the axial orientation of all its ring protons, in agreement with the literature [31]. Hence, compound **1** was identified as apigenin 6-*C*-(2-deoxy-β-D-galactopyranoside)-7-*O*-β-D-quinovopyranoside, a new compound isolated for the first time in planta.

**Figure 2.** Chemical structures of compounds **1**–**12** isolated from CLL extract.

**Figure 3.** Key HMBC correlations of compounds **1**, **8**, and **11a**.


**Table 2.** 1H (600 MHz) and 13C NMR (150 MHz) data of compounds **1**–**4** in CD3OD.

Compound **8** (Figure 2), a novel dihydrochalcone reported here for the first time, was isolated as a yellowish amorphous powder soluble in 100% MeOH. Its molecular formula was established as C49H60O23, based on a deprotonated ion peak detected at *m*/*z* 1015.3460 [M-H]<sup>−</sup> (calculated for C49H59O23<sup>−</sup>, *m*/*z* 1015.34526; error −0.7 ppm) in the HR-ESI-MS spectrum. The IR spectrum of compound **8** showed absorption bands ascribable to hydroxyl (3432 cm<sup>−1</sup>) and carbonyl (1616 cm<sup>−1</sup>) moieties [28]. The 13C/DEPT spectrum (Figure S40) revealed the presence of 49 carbon atoms with 48 directly attached protons (3 × CH2, 11 × C, 31 × CH, 4 × CH3). Analysis of the 1H-1H COSY, HSQC, and HMBC spectra led to the assignment of a dihydrochalcone skeleton [28] (Figures S41–S43, Table 3). Typical dihydrochalcone signals were detected for four A2B2-type aromatic protons at δ 6.91 (1H, br.s, H-2), δ 7.09 (1H, d, *J* = 8.4 Hz, H-6), δ 6.62 (1H, br.s, H-3), and δ 6.70 (1H, d, *J* = 8.4 Hz, H-5), together with an aromatic proton singlet at δ 6.00 (H-5′) and four aliphatic methylene protons at δ 2.73 and 2.68 (H2-7) and δ 3.19 and 3.35 (H2-8) (Figure S38, Table 3). Further, the 13C NMR spectrum (Figure S39, Table 3) revealed a ketonic carbonyl signal at δ 205.1 (C-9) and two aliphatic methylene carbons at δ 30.8 (C-7) and 49.1 (C-8), confirming the 3′-substituted phloretin structure. The presence of two trans-olefinic protons at δ 7.51 (1H, d, *J* = 16.0 Hz, H-7) and 6.26 (1H, d, *J* = 16.0 Hz, H-8), five aromatic protons at δ 7.51 (2H, d, *J* = 3.8 Hz, H-2/6) and 7.38 (3H, m, H-3/4/5), a carboxyl carbon at δ 168 (C-9), two olefinic carbons at δ 146 (C-7) and 119.3 (C-8), and six aromatic carbons was typical of a cinnamoyl unit [32]. Together with an acetate methyl singlet at δ 2.02 and the two carbons of an acetyl moiety at δ 173.6 and 20.5 [33], these signals suggested an acetylated phloretin cinnamate. However, the HMBC experiment (Figure S43) could not clarify the connection of the acetyl group, most probably because the low-temperature NMR technique would be needed [34]. Excluding the carbons of the dihydrochalcone, acetyl, and trans-cinnamoyl units, 23 carbons remained in the 13C NMR spectrum and were assigned to four sugar moieties, comprising three hexoses and a pentose. Based on the sugar chemical shift (δ) and coupling constant (*J*) values, as well as the 1H-1H COSY cross peaks, the three hexose moieties were determined to be β-fucopyranosyl units. The signals for an anomeric proton at δ 5.09 (1H, d, *J* = 9.9 Hz, H-1), a methyl proton doublet at δ 1.31 (3H, d, *J* = 4.2 Hz, H3-6), five oxymethine carbons at δ 74.1 (C-1), 71.9 (C-2), 84.5 (C-3), 74.1 (C-4), and 76.3 (C-5), and a methyl carbon at δ 17.2 (C-6) are consistent with those of a β-fucosyl moiety attached to C-3′ of the phloretin moiety via a *C*-glycosidic linkage [35]. The HMBC spectrum could not, however, confirm this linkage, as no correlation appeared between H-1 and C-2′, C-3′, or C-4′. This is likely attributable to the coexistence of two rotamers, caused by restricted rotation around the single bond between C-9 and C-1′ that results from the steric hindrance of the cinnamoyl moiety [34].
Further, the pattern of δ changes at C-1 (−1.7 ppm), C-2 (+1.9), and C-3 (−1.7) caused by esterification, relative to the unesterified analog (carambolaside Ja), confirmed the attachment of the cinnamoyl unit at C-2 [34]. The downfield shifts of C-3 (Δδ +1.6) and C-4 (Δδ +0.5), relative to those in compound **7** (Figure S32), located the acetyl moiety at C-4. A second hexose was assigned as a β-fucopyranose based on its anomeric signals at δ 4.36 (1H, d, *J* = 7.6 Hz, H-1) and δ 106 (C-1), a methyl proton doublet at δ 1.26 (3H, d, *J* = 6.4 Hz, H3-6), four oxymethine carbons at δ 72.3 (C-2), 74.7 (C-3), 73 (C-4), and 72 (C-5), and a methyl carbon at δ 16.8 (C-6). It is connected via an oxygen linkage [35] between C-1 and C-3, as confirmed by the HMBC correlation from H-1 to C-3 (Figure 3 and Figure S43). The third sugar gave signals typical of a pentose: an anomeric proton at δ 5.71 (1H, s, H-A1), four oxymethine carbons at δ 106.6 (C-A1), 92.9 (C-A2), 76.3 (C-A3), and 84.1 (C-A4), and an oxymethylene carbon at δ 62 (C-A5), consistent with an α-arabinofuranosyl moiety [36]. Lastly, a third β-fucopyranosyl moiety was assigned from its anomeric signals at δ 3.97 (1H, br.s, H-F1) and δ 105.6 (C-F1), methyl protons at δ 1.29 (1H, s, H1-F6) and δ 0.75 (2H, s, H2-F6), four oxymethine carbons at δ 72.9 (C-F2), 72.2 (C-F3), 75 (C-F4), and 71.9 (C-F5), and a methyl carbon at δ 16.9 (C-F6).

The HMBC experiment could not clarify the connections of the acetyl group, the α-arabinofuranosyl moiety, or the last *β*-fucopyranosyl moiety, which would require the low-temperature NMR technique [34]. Altogether, compound **8** was identified as phloretin 3′-*C*-(2-*O*-(*E*)-cinnamoyl-3-*O*-β-D-fucopyranosyl-4-*O*-acetyl)-β-D-fucopyranosyl-6′-*O*-β-D-fucopyranosyl-(1→2)-α-L-arabinofuranoside.
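The carbon bookkeeping behind the sugar assignments of compounds **1** and **8** can be sketched as a small calculation. This is an illustrative sketch only: the constant and function names are hypothetical, and the per-unit carbon counts (15 for the C6-C3-C6 aglycone, 2 for acetyl, 9 for cinnamoyl, 6 per hexose or deoxyhexose, 5 per pentose) are the standard values for these units rather than figures taken from the spectra.

```python
SKELETON_CARBONS = 15   # C6-C3-C6 flavone or dihydrochalcone aglycone
ACETYL_CARBONS = 2
CINNAMOYL_CARBONS = 9   # C6-C3 acyl unit
HEXOSE_CARBONS = 6      # also covers 6-deoxyhexoses such as fucose and quinovose
PENTOSE_CARBONS = 5


def sugar_combinations(remaining_carbons, n_sugars):
    """Return every (hexoses, pentoses) pair of n_sugars units whose
    carbons add up exactly to remaining_carbons."""
    return [
        (hexoses, n_sugars - hexoses)
        for hexoses in range(n_sugars + 1)
        if hexoses * HEXOSE_CARBONS
        + (n_sugars - hexoses) * PENTOSE_CARBONS == remaining_carbons
    ]


# Compound 1: 27 carbons in total, flavone aglycone only.
remaining_1 = 27 - SKELETON_CARBONS
# Compound 8: 49 carbons in total, minus aglycone, acetyl, and cinnamoyl.
remaining_8 = 49 - SKELETON_CARBONS - ACETYL_CARBONS - CINNAMOYL_CARBONS

print(remaining_1, sugar_combinations(remaining_1, 2))
print(remaining_8, sugar_combinations(remaining_8, 4))
```

Under these assumptions, the 12 remaining carbons of compound **1** fit only two six-carbon sugars, and the 23 remaining carbons of compound **8** fit only three six-carbon sugars plus one pentose, matching the assignments above.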

Compound **11** (Figure 2), another novel dihydrochalcone, was isolated as a yellowish amorphous powder soluble in 100% MeOH. The molecular formula of compound **11** was established as C49H60O24, based on a deprotonated molecular ion peak detected at *m*/*z* 1031.3389 [M-H]<sup>−</sup> (calculated for C49H59O24<sup>−</sup>, *m*/*z* 1031.34018; error +1.2 ppm) in its HR-ESI-MS spectrum (Figure S55). The IR spectrum of compound **11** revealed two major absorption bands at 3432 cm<sup>−1</sup> and 1616 cm<sup>−1</sup>, consistent with hydroxyl and carbonyl moieties, respectively [28].


**Table 3.** 1H (600 MHz) and 13C NMR (150 MHz) data of compounds **5**–**9** in CD3OD isolated from CLL extract.



NMR spectral analysis confirmed that compound **11** existed as a mixture of two diastereoisomers (**11a** and **11b**). The 1H and 13C NMR data, in addition to the HSQC spectrum (Figures S56–S58), indicated a structure closely related to that of compound **8**, but with an extra hydroxyl group characteristic of a *p*-coumaroyl moiety in place of the cinnamoyl moiety of compound **8**; this moiety exists as (*E*) and (*Z*) isomers. For the (*E*)-isomer, the two olefinic protons were assigned at δ 7.45 (d, *J* = 14.3 Hz, H-7) and 6.05 (d, *J* = 14.3 Hz, H-8), characteristic of an (*E*)-*p*-coumaroyl moiety, in addition to four aromatic protons of a *para*-substituted ring at δ 7.36 (2H, d, *J* = 8.6 Hz, H-2/6) and 6.67 (2H, d, *J* = 8.5 Hz, H-3/5). Moreover, the 13C NMR spectrum (Figure S57) exhibited a carboxyl carbon at δ 168.9 (C-9), two olefinic carbons at δ 145.2 (C-7) and 113.2 (C-8), and six aromatic carbons, typical of an (*E*)-coumaroyl unit [37]. In contrast, the (*Z*)-*p*-coumaroyl moiety exhibited signals of four aromatic protons of a *para*-substituted ring at δ 7.19 and 7.4 (2H, s, H-2/6), with unresolved peaks corresponding to H-3/5, and two olefinic protons at δ 6.69 (d, *J* = 11.2 Hz, H-7) and 5.60 (d, *J* = 11.6 Hz, H-8), coupled with a characteristic constant of *J* = 11.2 Hz. The 13C NMR spectrum exhibited a carboxyl carbon at δ 168.9 (C-9), two olefinic carbons at δ 142.4 (C-7) and 114.8 (C-8), and six aromatic carbons, typical of a (*Z*)-coumaroyl unit [37]. An acetate methyl singlet at δ 2.04 and the two carbons of an acetyl moiety at δ 172.0 and 19.1 were detected [33], as in compound **8**. The chemical shift changes at C-3 (Δδ +0.3) and C-4 (Δδ −2), relative to those in compound **10** (Figure S51, Table 4), located the acetyl moiety at C-4. The HMBC experiment (Figure 3) could not confirm the connection of the acetyl group, most probably because the low-temperature NMR technique would be needed, as for **8** [34].
Consequently, compound **11a** was established as phloretin 3′-*C*-(2-*O*-(*E*)-*p*-coumaroyl-3-*O*-β-D-fucosyl-4-*O*-acetyl)-β-D-fucosyl-6′-*O*-(2-*O*-β-D-fucosyl)-α-L-arabinofuranoside, whereas **11b** was assigned as phloretin 3′-*C*-(2-*O*-(*Z*)-*p*-coumaroyl-3-*O*-β-D-fucosyl-4-*O*-acetyl)-β-D-fucosyl-6′-*O*-(2-*O*-β-D-fucosyl)-α-L-arabinofuranoside. These compounds are reported for the first time in nature. The other compounds, previously reported in the literature, were identified by comparison of their spectroscopic data with published values and are isolated here for the first time from starfruit leaves: carambolaside M (**5**) [11] (Figures S20–S24), carambolaside Ia (**6**) [11] (Figures S25–S29), carambolaside J (**7**) [11] (Figures S30–S36), carambolaside I (**9**) [34] (Figures S44–S48), carambolasides P and O (**10a** and **10b**) [11] (Figures S49–S54), carambolaside Q (**12**) [11] (Figures S61–S67), luteolin 6-*C*-α-L-rhamnopyranosyl-(1→2)-β-D-fucopyranoside (**2**) [38] (Figures S7–S11), apigenin 6-*C*-β-D-galactopyranoside (**3**) [39] (Figures S12–S14), and apigenin 6-*C*-α-L-rhamnopyranosyl-(1→2)-β-L-fucopyranoside (**4**) [38] (Figures S15–S19).
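The distinction between the (*E*)- and (*Z*)-acyl units rests on the vicinal coupling constant of the two olefinic protons: as a general rule, trans couplings fall at roughly 12–18 Hz and cis couplings at roughly 6–12 Hz. A minimal sketch of that decision is given below; the function name and the 12.5 Hz cutoff are hypothetical choices made for illustration, not values taken from this study.

```python
def olefin_geometry(j_hz, cutoff_hz=12.5):
    """Classify a 1,2-disubstituted double bond from its vicinal 3J(H,H).

    The cutoff is an illustrative assumption: couplings above it are
    treated as trans (E), and those at or below it as cis (Z).
    """
    return "E" if j_hz > cutoff_hz else "Z"


# Coupling constants quoted in the text for the olefinic protons (Hz).
reported = {
    "8, cinnamoyl": 16.0,
    "11a, p-coumaroyl": 14.3,
    "11b, p-coumaroyl": 11.2,
}
for label, j in reported.items():
    print(f"{label}: J = {j} Hz -> ({olefin_geometry(j)})")
```

A single cutoff is a simplification, because the two ranges meet near 12 Hz; borderline values would need supporting evidence such as the olefinic carbon shifts discussed above.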


**Table 4.** 1H (600 MHz) and 13C NMR (150 MHz) data of compounds **10**–**12** in CD3OD.


#### *2.3. Structure-Activity Relationship Assessment of Isolated Compounds as α-Glucosidase Inhibitors*

To assess the potential efficacy of the CLL compounds, the isolated compounds were tested for their in vitro α-glucosidase inhibitory activity. Because of the limited yield, an in vivo assay could not be performed. The measured activities are discussed in relation to the flavonoid structures in the next subsections, for each class separately, to identify the motifs most crucial for activity within each class.
