**2. Results and Discussion**

The concentrated EtOH extract of the sponge *Halichondria vansoesti* was partitioned between aqueous EtOH and *n*-hexane. The aqueous EtOH-soluble materials were subjected to reversed-phase column chromatography (YMC-gel) and eluted successively with H2O→EtOH:H2O (3:7)→EtOH:H2O (2:3)→EtOH:H2O (1:1)→EtOH:H2O (3:2), yielding several subfractions. The subfractions eluted with EtOH:H2O (3:7) to EtOH:H2O (3:2) were further purified by reversed-phase HPLC (YMC-ODS-A) to give **1**, **2** and **4**–**10**. The subfraction eluted with H2O was extracted with BuOH, after which the butanol solution was concentrated and subjected to reversed-phase HPLC (YMC-ODS-A) to obtain **3**.

The molecular formula of **1**, C30H44NNa3O14S3, was established from the [M3Na − Na]<sup>−</sup> ion peak at *m*/*z* 784.1724 in the (−)-HRESIMS. In addition, peaks at *m*/*z* 380.5927 and 246.0657, corresponding to the doubly- and triply-charged ions ([M3Na − 2Na]<sup>2−</sup> and [M3Na − 3Na]<sup>3−</sup>, respectively), were observed in the (−)-HRESIMS of **1** (Figure S2).
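
As a rough cross-check of these assignments, the expected *m*/*z* values can be recomputed from the molecular formula. The sketch below is only illustrative: the function names are hypothetical, the monoisotopic masses are standard reference values rather than numbers from this work, and each ion is assumed to form by loss of *n* Na<sup>+</sup> cations, leaving *n* negative charges.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes (standard reference
# values, not taken from the paper).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "Na": 22.98976928,
    "S": 31.97207117,
}
ELECTRON = 0.00054858  # electron mass (u)


def parse_formula(formula):
    """Parse a flat formula such as 'C30H44NNa3O14S3' into {element: count}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def monoisotopic_mass(formula):
    """Monoisotopic mass of the neutral trisodium salt."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def mz_after_sodium_loss(formula, n_lost):
    """m/z of [M - nNa]^(n-): n Na+ cations removed, charge state n."""
    na_cation = MONOISOTOPIC["Na"] - ELECTRON  # mass of Na+
    return (monoisotopic_mass(formula) - n_lost * na_cation) / n_lost


# Compound 1: singly-, doubly-, and triply-charged ions.
for n in (1, 2, 3):
    print(n, round(mz_after_sodium_loss("C30H44NNa3O14S3", n), 4))
```

Under these assumptions, the three computed values should fall within a few mDa of the observed peaks at *m*/*z* 784.1724, 380.5927, and 246.0657.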

The 1D- and 2D-NMR data of **1** (Tables 1 and 2, Figures S3–S7) indicated that this compound contains five methyl groups, comprising three angular methyl groups of the steroid nucleus (δ<sub>H</sub> 0.70/δ<sub>C</sub> 15.6, δ<sub>H</sub> 0.82/δ<sub>C</sub> 19.4, δ<sub>H</sub> 1.44/δ<sub>C</sub> 26.0) and two methyl groups of the side chain (δ<sub>H</sub> 0.92/δ<sub>C</sub> 19.5, δ<sub>H</sub> 1.13/δ<sub>C</sub> 20.1); eight methylene groups (including an *N*-substituted methylene); eleven methine groups, including four oxygenated methines (δ<sub>H</sub> 4.98/δ<sub>C</sub> 76.4, δ<sub>H</sub> 4.83/δ<sub>C</sub> 76.6, δ<sub>H</sub> 4.78/δ<sub>C</sub> 77.1, δ<sub>H</sub> 4.49/δ<sub>C</sub> 69.2); three quaternary sp<sup>3</sup> carbons (δ<sub>C</sub> 15.6, δ<sub>C</sub> 26.0, δ<sub>C</sub> 19.4); two trisubstituted double bonds (δ<sub>H</sub> 5.35/δ<sub>C</sub> 118.2, 147.1, δ<sub>H</sub> 6.85/δ<sub>C</sub> 139.4, 144.3); and an amide carbon (δ<sub>C</sub> 177.6).


**Table 1.** 1H NMR data for **1**–**4**, **8** and **10**.

<sup>a</sup> Measured in CD3OD at 500 MHz. <sup>b</sup> Measured in a mixture of CD3OD + CDCl3 (~10:1) at 700 MHz. <sup>c</sup> Measured in CD3OD at 700 MHz.


**Table 2.** 13C NMR data for **1**–**4**, **8** and **10**.

<sup>a</sup> Measured in CD3OD at 175 MHz. <sup>b</sup> Measured in a mixture of CD3OD + CDCl3 (~10:1) at 175 MHz.

Further analysis of the 1D- and 2D-NMR data of **1** and comparison with literature data revealed that **1** contains a Δ9(11)-4β-hydroxy-14α-methyl-2β,3α,6α-trisulfated steroid core (Figure 2, substructure I) and the same side chain, containing C-20 (21) to C-24 (29) (Figure 2, substructure III), as that found in the previously described topsentiasterol sulfates A–E [15], Sch 575867 [16], spheciosterol sulfates A–C [17], and chloro- and iodotopsentiasterol sulfates D (**5**, **6**) [18]. The 1H and 13C NMR spectra of **1** (Tables 1 and 2, Figures S3 and S4) were almost identical to those of topsentiasterol sulfate C [15]. The only exceptions were the signals of the protons linked to C-27 and C-28, which were shifted to lower frequencies (δ<sub>H</sub> 6.85/δ<sub>C</sub> 139.4, δ<sub>H</sub> 3.92/δ<sub>C</sub> 48.3) in the spectrum of **1**. Moreover, the HRESIMS data (Figure S2) showed that the molecular mass of **1** was 1 amu less than that of topsentiasterol sulfate C. Based on these data, in combination with the 2D NMR data (Figures S5–S7), the presence of a 1,5-dihydro-2*H*-pyrrol-2-one moiety in the terminal part of the side chain of **1** (Figure 2, substructure III) was proposed. To the best of our knowledge, this is the first report of a 1,5-dihydro-2*H*-pyrrol-2-one moiety in polysulfated steroids from sponges. Comparison of the NOESY data (Figure S8) of the steroid nucleus of **1** with those of topsentiasterol sulfate C and the related analogs [15–18] suggests that all the stereogenic centers of these compounds have the same relative configurations. Key NOESY correlations of the steroid core of **1** are shown in Figure 3. Thus, **1** is a new analogue of topsentiasterol sulfate C [15], containing a unique structural element with a nitrogen atom in the side chain. It was therefore named topsentiasterol sulfate G.

**Figure 2.** Substructures of **1**–**4** and **8**–**10** with the key COSY (bold line) and HMBC (arrow line) correlations.

**Figure 3.** Key NOESY correlations of **1** and **10**.

Detailed studies of the 1D- and 2D-NMR spectra of **1**–**4**, **8** and **9** (including the determination of the relative configuration of stereogenic centers using NOESY data, Tables 1 and 2, Figure 3) were performed. The generated data were compared to those of the known analogs, such as topsentiasterol sulfates A–E [15], Sch 575867 [16], spheciosterol sulfates A–C [17], chlorotopsentiasterol sulfate D (**5**), and iodotopsentiasterol sulfate D (**6**) [18]. Indeed, signals of the steroid nuclei in these compounds and in the isolated polysulfated steroids were almost identical. Therefore, it was proposed that **1**–**4**, **8** and **9** have the same Δ9(11)-4β-hydroxy-14α-methyl-2β,3α,6α-trisulfated steroid nucleus, with a variation of the side chain.

The molecular formula of **2**, C32H47Na3O16S3, was established from the [M3Na − Na]<sup>−</sup> ion peak at *m*/*z* 829.1834 in the (−)-HRESIMS. In addition, peaks at *m*/*z* 403.0980 and 261.0694, corresponding to the doubly- and triply-charged ions ([M3Na − 2Na]<sup>2−</sup> and [M3Na − 3Na]<sup>3−</sup>, respectively), were observed in the (−)-HRESIMS of **2** (Figure S9).

The 1H and 13C NMR data (CD3OD, Tables 1 and 2, Figures S10 and S11) of the side chain of **2** resemble those of topsentiasterol sulfate A [15], except for the presence of a methyl group at δ<sub>H</sub> 1.24 (t, *J* = 7.1 Hz)/δ<sub>C</sub> 16.1 (C-32) and a methylene group at δ<sub>H</sub> 3.74, 3.85/δ<sub>C</sub> 67.1 (C-31). Further analysis of the 2D-NMR data, including the COSY and HMBC spectra (Figures S12 and S14), revealed the following correlations: H-32/H-31, H-31/C-32, H-31/C-28, and H-28/C-31 (Figure 2, substructure IV). In addition, the HRESIMS spectrum showed that the molecular weight of **2** is 28 amu greater than that of topsentiasterol sulfate A [15]. Based on these data, **2** was elucidated as the ethyl ester of topsentiasterol sulfate A [15]. Since **2** has not been reported previously, it was named topsentiasterol sulfate I.
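
The 28 amu difference is what one would expect when a hydrogen is replaced by an ethyl group, that is, a net gain of C2H4. As a quick check, using standard monoisotopic masses (an assumption, not values reported in this work):

$$
\Delta m(\mathrm{C_2H_4}) = 2 \times 12.0000 + 4 \times 1.0078 = 28.0313\ \mathrm{u}
$$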

The molecular formula of **3**, C30H43Na3O17S3, was established from the [M3Na − H]<sup>−</sup> ion peak at *m*/*z* 839.1253 in the (−)-HRESIMS spectrum. In addition, peaks at *m*/*z* 408.0685 and 264.3826, corresponding to the doubly- and triply-charged ions ([M3Na − Na − H]<sup>2−</sup> and [M3Na − 2Na − H]<sup>3−</sup>, respectively), were observed in the HRESIMS of **3** (Figure S16).

The 13C and 1H NMR spectra of **3** (CD3OD+CDCl3, ~10:1, Tables 1 and 2, Figures S17 and S18) exhibited the signals of two carboxyl carbons (δ<sub>C</sub> 166.8 and 171.9) and a trisubstituted double bond (δ<sub>H</sub> 5.92/δ<sub>C</sub> 125.9, 157.9). The HMBC spectrum of **3**, recorded in DMSO-*d*<sub>6</sub> (Figure S21), displayed correlations from H-29 to C-26 and from H-27 to C-28. Based on these data and the mass spectrometry data, the presence of a 2-substituted maleic acid in the side chain of **3** was suggested (Figure 2, substructure V). The *Z*-configuration of the double bond in this fragment was established using a NOESY experiment, in which a correlation from H-29 to H-27 was observed (Figure S22).

To confirm the structure of **3**, methylation with diazomethane was carried out. The structure of the resulting product **3a** was clarified using 2D-NMR and HRESIMS. Cross peaks from OMe-26 to C-26 and from OMe-28 to C-28 were observed in the HMBC spectrum (Figure 2, substructure Va). In addition, the peaks of the [M3Na − Na]<sup>−</sup> and [M3Na − 2Na]<sup>2−</sup> ions were observed in the (−)-HRESIMS at *m*/*z* 845.1764 and 411.0943, respectively (Figure S23). These data revealed that a dimethyl maleate unit is located at the terminus of the side chain of the methylated derivative, as in the previously described topsensterol A, a polyhydroxylated steroid from the sponge *Topsentia* sp. [27]. Additionally, the desulfation of **3** with trifluoroacetic acid was carried out. The structure of the obtained product (**11**) was established from the analysis of the HRESIMS data (Figure 4, Figure S24).

**Figure 4.** The scheme of the desulfation reaction of **3**.

Thus, **3** is a new analogue of polysulfated steroids from sponges, with a 2-substituted maleic acid in the terminal part of the side chain.

Compound **4** was isolated as an inseparable mixture with the previously reported chlorotopsentiasterol sulfate D (**5**) and iodotopsentiasterol sulfate D (**6**) [18] (2:7:1). Detailed analysis of the HRESIMS (Figure S25) and the 1D- and 2D-NMR spectra (Tables 1 and 2, Figures S26–S31) of the mixture, together with comparison of these data with those of the previously described compounds [15–18], led to the identification of the structure of **4**. The molecular formula of **4** was determined as C30H42BrNa3O14S3 from the (−)-HRESIMS, in which peaks of singly-, doubly-, and triply-charged ions were observed (*m*/*z* 847.0736, [M3Na − Na]<sup>−</sup>; *m*/*z* 412.0424, [M3Na − 2Na]<sup>2−</sup>; *m*/*z* 267.0321, [M3Na − 3Na]<sup>3−</sup>) (Figure S25). The measured intensities of the isotope peaks of **4** (412.0424 (100.0%), 412.5440 (36.3%), 413.0415 (122.0%), 413.5429 (41.6%), 414.0391 (61.7%)) are in good agreement with the calculated intensities of the isotope peaks for [M3Na − 2Na]<sup>2−</sup> (412.0414 (100%), 412.5430 (35.8%), 413.0406 (119.8%), 413.5420 (41.3%), 414.0404 (24.0%)). The 1H NMR spectrum (CD3OD, Table 1, Figure S26) displayed three pairs of doublets shifted to higher frequency, corresponding to H-27 and H-28 of the bromo-, chloro-, and iodo-substituted furans. Comparison of the chemical shift values of these signals with those of chloro- and iodotopsentiasterol sulfates D from the literature [18] allowed us to assign the proton signals at δ<sub>H</sub> 6.39 (H-27) and 7.48 (H-28) to **4** (Table 1). Integration of the signals corresponding to H-27 and H-28 in the 1H NMR spectrum of the mixture of **4**, **5**, and **6** showed that the mixture contains about 20% bromotopsentiasterol sulfate D (**4**) and 70% and 10% of the chloro and iodo derivatives (**5**, **6**) [18], respectively. Additional interpretation of the COSY, HSQC, and HMBC data confirmed that **4** is composed of the substructures I and VI (Figure 2, Figures S28–S30).
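
The calculated isotope intensities quoted above can be approximated with a simple convolution of elemental isotope distributions. The following is a minimal sketch, not the software used in this work: the helper names are hypothetical, the natural abundances are rounded reference values, and peaks are binned by nominal mass offset and normalized to the monoisotopic peak.

```python
# Natural isotope abundances keyed by nominal mass offset from the lightest
# isotope (rounded reference values; an assumption of this sketch).
ISOTOPES = {
    "C": {0: 0.9893, 1: 0.0107},
    "H": {0: 0.999885, 1: 0.000115},
    "O": {0: 0.99757, 1: 0.00038, 2: 0.00205},
    "S": {0: 0.9499, 1: 0.0075, 2: 0.0425, 4: 0.0001},
    "Na": {0: 1.0},
    "Cl": {0: 0.7576, 2: 0.2424},
    "Br": {0: 0.5069, 2: 0.4931},
}


def convolve(a, b):
    """Combine two {offset: probability} distributions."""
    out = {}
    for i, p in a.items():
        for j, q in b.items():
            out[i + j] = out.get(i + j, 0.0) + p * q
    return out


def isotope_pattern(composition, n_peaks=5):
    """Relative intensities (%) of the first n_peaks isotope peaks,
    normalized to the monoisotopic peak."""
    dist = {0: 1.0}
    for element, count in composition.items():
        for _ in range(count):
            dist = convolve(dist, ISOTOPES[element])
            # Offsets beyond the requested window never feed back into
            # lower offsets, so they can be dropped safely.
            dist = {k: v for k, v in dist.items() if k < n_peaks}
    base = dist[0]
    return [100.0 * (dist.get(k, 0.0) / base) for k in range(n_peaks)]


# [M3Na - 2Na]2- of compound 4: C30H42BrNa3O14S3 minus two sodium atoms.
dianion_4 = {"C": 30, "H": 42, "Br": 1, "Na": 1, "O": 14, "S": 3}
print([round(x, 1) for x in isotope_pattern(dianion_4)])
```

For a doubly-charged ion, consecutive nominal-mass offsets appear about 0.5 *m*/*z* units apart, so the five values map onto the peaks from *m*/*z* 412.04 to 414.04.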

New trisulfated steroids, dichlorotopsentiasterol sulfate D (**8**) and bromochlorotopsentiasterol sulfate D (**9**), were isolated as an inseparable mixture. Attempts to separate **8** and **9** by repetitive HPLC failed; however, based on HRESIMS and 1D- and 2D-NMR data, the mixture was estimated to contain **8** and **9** in a 9:1 ratio. The molecular formulae of **8** and **9**, C30H41Cl2Na3O14S3 and C30H41ClBrNa3O14S3, were established from the [M3Na − Na]<sup>−</sup> ion peaks at *m*/*z* 837.0837 and 881.0340 in the (−)-HRESIMS spectrum. The predominant peaks at *m*/*z* 407.0480 and 429.0227 corresponded to the doubly-charged ions [M3Na − 2Na]<sup>2−</sup>, similar to those in the MS of some pentacyclic guanidine alkaloids [28–31] and two-headed sphingolipids [32]. Moreover, the triply-charged ions [M3Na − 3Na]<sup>3−</sup> of both compounds were also observed (*m*/*z* 263.7026 and 278.3516, respectively) (Figure S32). The intensities of the isotope peaks confirm the proposed molecular formula of **8**, C30H41Cl2Na3O14S3 (measured: 407.0480 (100%), 407.5496 (37.2%), 408.0485 (87.6%), 408.5499 (30.8%), 409.0471 (26.6%); calculated for [M3Na − 2Na]<sup>2−</sup>: 407.0472 (100%), 407.5488 (35.8%), 408.0461 (86.5%), 408.5474 (29.36%), 409.0452 (26.7%)). Likewise, the intensities of the isotope peaks confirm the proposed molecular formula of **9**, C30H41ClBrNa3O14S3 (measured: 429.0227 (100.0%), 429.5244 (35.3%), 430.0219 (149.4%), 430.5233 (52.0%), 431.0213 (61.9%), 431.5225 (19.7%); calculated for [M3Na − 2Na]<sup>2−</sup>: 429.0220 (100.0%), 429.5235 (35.8%), 430.0210 (151.8%), 430.5224 (52.7%), 431.0201 (62.3%), 431.5213 (19.9%)).
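
The doubly-charged assignment of these ions is also consistent with the spacing of the measured isotope peaks, which is close to 0.5 *m*/*z* units. A minimal sketch of that check follows; the function name is hypothetical, and the spacing between adjacent isotopologues is approximated by the 13C/12C mass difference, which ignores the slightly different 37Cl/35Cl and 81Br/79Br increments.

```python
C13_C12_DIFF = 1.00335  # mass difference between 13C and 12C (u)


def charge_from_spacing(peaks_mz):
    """Estimate the charge state from the mean spacing of adjacent isotope peaks."""
    spacings = [b - a for a, b in zip(peaks_mz, peaks_mz[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(C13_C12_DIFF / mean_spacing)


# Measured isotope peaks of 8 and 9 quoted in the text above.
measured_8 = [407.0480, 407.5496, 408.0485, 408.5499, 409.0471]
measured_9 = [429.0227, 429.5244, 430.0219, 430.5233, 431.0213, 431.5225]

print(charge_from_spacing(measured_8), charge_from_spacing(measured_9))
```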

The 1H and 13C NMR spectra of the mixture of **8** and **9** (CD3OD, Tables 1 and 2, Figures S33 and S34) closely resembled those of chlorotopsentiasterol sulfate D (**5**) [18]. The main difference between the NMR spectra of these compounds was the presence of an H-27 singlet at δ<sub>H</sub> 6.31 for **8** and δ<sub>H</sub> 6.44 for **9** (integration of these signals gave an **8**:**9** ratio of 9:1), instead of the two characteristic doublets at δ<sub>H</sub> 6.39 and 7.36, corresponding to H-27 and H-28, in the 1H NMR spectrum of the monochlorinated compound **5** [18]. Analysis of the COSY, HSQC, and HMBC spectra confirmed the substructures I and VII (Figure 2, Figures S35–S37) in **8**.

To determine the positions of the halogen atoms in **9**, we carefully analyzed the 1H NMR and COSY spectra of the mixture of **4** and **5** (Table 1, Figures S28 and S37a) and detected two cross-peaks in the COSY spectrum: δ<sub>H</sub> 2.45 (H-24)/δ<sub>H</sub> 1.14 (H-29), corresponding to **4** (26-bromo), and δ<sub>H</sub> 2.56 (H-24)/δ<sub>H</sub> 1.15 (H-29), corresponding to **5** (26-chloro). Thus, with the bulkier bromine substituent at C-26, the signals of H-24 and H-29 appear at higher field. Taking into account that the COSY spectrum of **8** + **9** (Table 1, Figures S35 and S37a) showed only one cross-peak, δ<sub>H</sub> 2.55 (H-24)/δ<sub>H</sub> 1.15 (H-29), similar to the cross-peak of the 26-chloro compound **5**, the chlorine atom in **9** was placed at C-26. Based on these data and the HRESIMS data (see above), **9** was assigned as bromochlorotopsentiasterol sulfate D. Nevertheless, the localization of Cl at C-26 and Br at C-28 in **9** needs to be further confirmed.

Compounds **8** and **9** represent the first dihalogenated trisulfated steroids found in sponges.

The molecular formula of **10**, C27H45Na3O13S3, was established from the [M3Na − Na]<sup>−</sup> ion peak at *m*/*z* 719.1819 in the (−)-HRESIMS. The base peak at *m*/*z* 348.0969 corresponded to the doubly-charged ion [M3Na − 2Na]<sup>2−</sup> (Figure S38).

Detailed analysis of the 1H and 13C NMR, COSY, HSQC, HMBC, and NOESY spectra of **10** (CD3OD, Tables 1 and 2, Figure 2, substructures II and VIII, Figures S39–S44) and comparison of its 1H and 13C chemical shift values with those reported in the literature for the previously described trisulfated steroids [1–18] indicated that **10** is a previously unreported 4β-hydroxy derivative of halistanol sulfate C [3]. It was therefore named 4β-hydroxyhalistanol sulfate C.

Interestingly, unlike all the previously described trisulfated steroids containing a 4β-hydroxy group [15–18], **10** contains neither a C-9/C-11 double bond nor an α-methyl group at C-14. Thus, **10** is the first member of a new structural subgroup of trisulfated steroids from sponges.

The biosynthesis of the unusual side chains of trisulfated steroids such as **1**–**9** could be hypothesized to originate from codisterol (**12**) (Figure S45) [33]. This process could proceed via C-27 alkylation of **12**, followed by proton loss and several reactions, such as amination or hydration of double bonds accompanied by cyclization, oxidation, hydrolysis, and halogenation, which would result in the formation of **1**–**6**, **8**, and **9** (Scheme 1).

**Scheme 1.** Proposed biogenesis of the side chains in **1**–**9**.

The biological activities of **3**, **7**, and **10**, as well as of the mixtures of **4** + **5** + **6** and **8** + **9**, were investigated using the human prostate cancer cell lines PC-3 and 22Rv1. PC-3 cells are known to be androgen-independent, as they do not express the androgen receptor (AR(-)). 22Rv1 cells express both the androgen receptor (AR(+)) and the androgen receptor splice variant 7 (AR-V7(+)); the expression of AR-V7 mediates the resistance of this cell line to androgen-deprivation therapy [34,35]. Prostate-specific antigen (PSA) is a downstream target gene of the androgen receptor (AR) pathway. Thus, suppression of PSA expression may indicate inhibition of AR signaling. AR signaling is essential for the growth and survival of a significant number of prostate cancer cell types. In fact, downregulation of AR signaling mediated by androgen withdrawal is the standard first-line therapy for advanced human prostate cancer [36]. The isolated compounds and the mixtures were found to inhibit the expression of PSA in human drug-resistant 22Rv1 cells (Figure 5A). Compound **3** and the mixture of **4** + **5** + **6** suppressed PSA expression at a concentration as low as 10 μM (Figure 5A). Notably, the IC50 values of all the isolated compounds, determined with the MTT assay in PC-3 and 22Rv1 cells, were >100 μM, which could be due to the androgen-independent nature of these particular prostate cancer cell lines.

**Figure 5.** Effects of the compounds on prostate cancer cells. (**A**): Effect on PSA expression. 22Rv1 cells were treated with the compounds for 24 h, then the proteins were extracted and examined by Western blotting. β-Actin was used as a loading control. (**B**): Effect on glucose uptake. PC-3 cells were seeded in a 96-well plate, treated with the test compounds for 24 h in FBS- and glucose-free media, and incubated with 2-NBDG, after which the fluorescence was measured. Apigenin (50 μM) was used as a positive control (Apig). Cells treated with vehicle (DMSO) were used as a control (Con). The glucose uptake was normalized to the cell viability, measured by the MTS test. Significant difference from the control is shown as follows: \* *p* < 0.05 (Student's *t*-test).

Additionally, **3** and **7**, as well as the mixtures of **4** + **5** + **6** and **8** + **9**, suppressed glucose uptake in 22Rv1 cells (Figure 5B), whereas **10** did not exhibit this effect (data not shown). Cancer cells are characterized by increased glucose consumption, which is related to their rapid growth and metabolism [37]. Inhibition of glucose uptake, either by nutrient deprivation or by inhibitors, may suppress cancer cell proliferation and/or sensitize cancer cells to standard therapies. Moreover, recent studies have suggested a possible crosstalk between glycolysis and AR signaling [38]. However, cytotoxic effects and proliferation inhibition were observed only at high concentrations of the isolated compounds (data not shown). Nevertheless, owing to their promising activity on AR signaling and glucose uptake, **3** and **7**, as well as the mixtures of **4** + **5** + **6** and **8** + **9**, may serve as starting compounds for the development of novel prostate cancer drugs. To the best of our knowledge, this is the first report on the ability of marine-derived steroid compounds to suppress PSA expression/androgen receptor signaling, as well as glucose uptake, in cancer cells.

#### **3. Materials and Methods**
