**2. Results**

A combined approach of phenotypic screening and target screening, which we refer to as the PhenoTarget approach, was used to identify lead compounds against *Mycobacterium tuberculosis* from natural resources. The PhenoTarget approach began with the phenotypic screening of a natural product fraction library to identify fractions that were active in a high-throughput screen (HTS) against *M. tuberculosis* H37Rv. Following phenotypic screening, native mass spectrometry was used on pooled fractions to simultaneously identify the target protein and the molecular weight of the compound bound to it, using a panel of 37 purified mycobacterial proteins. The proteins in the panel were selected because they were all essential enzymes or virulence factors, their three-dimensional structures had been solved by the Seattle Structural Genomics Center for Infectious Diseases (SSGCID) [12,13], and SSGCID was able to supply purified proteins for the PhenoTarget screens. Screening of the pooled fractions against this panel of 37 putative anti-TB targets was performed on a magnetic resonance mass spectrometry (MRMS) system equipped with an automated chip-based Nanomate system. The PhenoTarget approach is summarized in Figure 2.

**Figure 2.** The cascade of PhenoTarget screening for identifying lead compounds and target proteins. The Nature Bank (NB) lead-like enhanced (LLE) fraction library was established following the procedure described by Camp *et al*. [14]. A high-throughput phenotypic screen of 202,983 NB LLE fractions against *M. tuberculosis* H37Rv was performed first. Active fractions with an MIC value of less than 6.1 μge/μL were identified and chosen for protein screening against a panel of 37 putative anti-TB targets from *Mycobacteria* species. To lower sample consumption, especially of protein, active fractions were pooled in groups of nine (Pool Fractions 1 to 40) and incubated with each of the target proteins. Native mass spectrometry was then used to identify free target protein (P, blue) and protein-ligand (P-L) complexes. The mass shift between the P (black) and the P-L (red) peaks provided the molecular weight of the bound ligand (L, purple), which facilitated the isolation and identification of the active compound.

### *2.1. Native Mass Spectrometry*

Figure 3 compares the native MRMS spectrum of free Rv1466, a protein associated with [Fe–S] complex assembly and repair (a), with the spectra of Rv1466 incubated with NB LLE Pool Fractions 4 (b) and 5 (c). All three spectra contain clusters of ions in three different charge states (+8, +7, and +6). However, the spectra with Pool Fractions 4 and 5 both contain an additional cluster of ions shifted to higher m/z values by identical amounts relative to the free Rv1466 ions. These new ions are also of higher intensity than the free Rv1466 ions and correspond to Rv1466-ligand (P–L) complexes. From the charge and the m/z difference between the P and P–L ion clusters, it is possible to obtain the molecular weight of the ligand bound to Rv1466, 233 Da. Because the ligands bound to Rv1466 from Pool Fractions 4 and 5 had the same molecular weight, and were therefore likely the same compound, and because this ligand appeared to have a high affinity for Rv1466 as deduced from the P:P-L intensity ratio, we chose to pursue the identity of this compound.

**Figure 3.** Overlay of the native magnetic resonance mass spectrometry (MRMS) spectra for free Rv1466 (a) and Rv1466 incubated with Pool Fraction 4 (b) and 5 (c). In the spectrum of free Rv1466 (P), clusters of ions corresponding to three different charge states of Rv1466 were observed. The same clusters of ions were observed in the spectra with Pool Fractions 4 and 5 but at lower intensities. Accompanying the ions for free Rv1466 in the spectra with the Pool Fractions are clusters of higher-intensity ions shifted to higher m/z values that correspond to Rv1466-ligand (P-L) complexes. The mass shift for the differently charged cluster pairs (P and P-L) was identical in both Pool Fractions, identifying the molecular weight of the bound ligand: Pool Fraction 4: MW = (2109.60372 – 2076.30627) × 7 = 233 Da; Pool Fraction 5: MW = (2109.60378 – 2076.30627) × 7 = 233 Da.
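The arithmetic in the Figure 3 caption can be reproduced with a short sketch. The function name `ligand_mass` is ours, chosen for illustration; the m/z values and the +7 charge state are those quoted in the caption.

```python
def ligand_mass(mz_complex, mz_free, charge):
    """Neutral mass (Da) of a bound ligand from the m/z shift between the
    protein-ligand (P-L) ion and the free protein (P) ion at the same charge."""
    return (mz_complex - mz_free) * charge


# m/z values for the +7 charge state, taken from the Figure 3 caption
pool4 = ligand_mass(2109.60372, 2076.30627, charge=7)
pool5 = ligand_mass(2109.60378, 2076.30627, charge=7)

print(f"Pool Fraction 4: {pool4:.2f} Da")
print(f"Pool Fraction 5: {pool5:.2f} Da")
```

Both values round to 233 Da, which is why the two pools were taken to contain the same ligand.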

### *2.2. Target Fraction Confirmation*

A feature common to Pool Fractions 4 and 5 was that they contained LLE fractions (LLE-2 and LLE-3, respectively) from the same marine organism, *Polycarpa aurata*, suggesting that a compound from *P. aurata* had interacted with Rv1466. A detailed high-resolution mass spectrometry (HRMS) investigation of all nine fractions from Pool Fraction 5 showed no ion at m/z 233 in any fraction. Because HRMS of the fractions did not confirm the presence of the ligand, native MS was used to identify the correct fraction for isolation. Five fractions were generated from a re-extraction of *P. aurata* with 95% ethanol and named Fractions A, B, C, D and E (Figure 4a). HRMS analysis of the constituents of Fraction C on a quadrupole-time-of-flight (Q-TOF) mass spectrometer identified an ion at m/z 236 as well as a higher mass ion at m/z 469, corresponding to the [M + H]<sup>+</sup> ion of a 468 Da species (Figure 4b). Another round of native MRMS screening confirmed that a compound in Fraction C interacted with Rv1466 and that the molecular weight of the bound species was 233 Da (Figure 4c). Because the molecular weight of the ligand bound to Rv1466 was approximately half the molecular weight of the major species in Fraction C, our attention was drawn towards a potentially symmetrical parent compound with a molecular weight of 468 Da as the agent reacting with Rv1466.

**Figure 4.** (**a**) Five fractions (Fractions A, B, C, D and E) were collected from a fresh 95% ethanol extraction of *P. aurata*; (**b**) the HRMS analysis of Fraction C identified ions at m/z 236 and 469; (**c**) comparison of the native MRMS spectrum for free Rv1466 (top, red) with the incubation of Rv1466 with Fraction C (bottom, blue). A similar ionization pattern was observed as described in Figure 3 with Pool Fractions 4 and 5. Ions corresponding to free Rv1466 (P) and the Rv1466-ligand complex (P-L) are labeled. Molecular weight of ligand in Fraction C: MW = (2462.51133 – 2423.66863) × 6 = 233 Da.

### *2.3. Binding Compound Isolation and Structure Elucidation*

Separation of *P. aurata* extracts by reverse-phase semi-preparative HPLC (Figure 5) gave a compound identified as polycarpine, C22H24N6O2S2, a dimeric disulfide alkaloid previously isolated from *P. aurata* [15] and from a related species, *Polycarpa clavata* [16] (Figure 6). As illustrated in Figure 5a, the resonances observed in the one-dimensional 1H NMR spectrum of pure polycarpine (bottom, red) are present in the one-dimensional 1H NMR spectrum of Fraction C (top, black). Moreover, pure polycarpine elutes with a retention time identical to that of a band in the LC profile of Fraction C (Figure 5b). The mass spectra of both bands are identical (Figure 5c) and consistent with a species with a molecular weight of 468 Da.
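As a consistency check on the assignment, the monoisotopic mass of the polycarpine formula can be computed and compared with the ions discussed above. This is a minimal sketch: the helper `monoisotopic_mass` and the small mass table are ours, using standard monoisotopic masses for the most abundant isotopes, and the parser handles only simple formulas without parentheses.

```python
import re

# Standard monoisotopic masses (Da) of the most abundant isotopes
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}
PROTON = 1.007276  # mass of a proton (Da)


def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C22H24N6O2S2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


mass = monoisotopic_mass("C22H24N6O2S2")  # polycarpine
mz_mh = mass + PROTON                      # singly protonated [M + H]+

print(f"M        = {mass:.4f} Da")
print(f"[M + H]+ = {mz_mh:.4f}")
print(f"M / 2    = {mass / 2:.2f} Da")
```

The computed [M + H]<sup>+</sup> value rounds to m/z 469, in line with the higher-mass ion seen for Fraction C, and half the neutral mass is about 234 Da, close to the 233 Da ligand mass obtained by native MRMS.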

**Figure 5.** (**a**) 1H NMR spectra (recorded in DMSO-*d*6) of Fraction C (black, top) and polycarpine (red, bottom). The inset is an expansion of the downfield regions of the spectra circled in blue; (**b**) HPLC analysis of Fraction C (top) and polycarpine (bottom); (**c**) Mass spectral analysis of the HPLC bands from Fraction C and polycarpine, marked with a blue rectangle, showed an identical ion at 234 Da.

**Figure 6.** The structure of polycarpine.

### *2.4. Pseudo-KD Value Determination of Polycarpine with Rv1466*

An advantage of native mass spectrometry is that, in addition to the qualitative identification of protein-ligand complex formation, the technique also provides quantitative information on the strength of the interaction [17,18]. This is because the dissociation constant, KD, can be estimated from the ratio of free and bound protein observed in the mass spectra. The estimate is obtained by collecting MRMS data at a fixed protein concentration with increasing concentrations of the ligand (Figure 7a). Figure 7b displays twelve mass spectra of samples containing 9 μM Rv1466 and increasing amounts of polycarpine (0.1–300 μM). At sufficiently high ligand concentrations, the intensity of the protein-ligand complex reached a plateau. The ratio of the protein-ligand peak intensity to the summed intensities of the protein and protein-ligand peaks was plotted against the concentration of polycarpine (Figure 7c). Using these ratios and Equations 1 and 2, a pseudo-KD of 5.3 ± 0.4 μM was calculated for polycarpine binding to Rv1466. A true KD cannot be calculated because, after binding, a covalent bond forms between the ligand and Rv1466 in a time-dependent manner, pushing the equilibrium toward the product.

**Figure 7.** Direct determination of pseudo-*K*D for polycarpine using a dose-response curve. (**a**) Cartoon representation of the Rv1466 structure (5IRD) closest to the average structure in the calculated ensemble. The α-helices and β-strands are colored gold and blue, respectively; (**b**) Overlay of twelve mass spectra of samples containing Rv1466 (9 μM) incubated with varying concentrations of polycarpine (0.1–300 μM); (**c**) The relative mass responses of protein-ligand complex with protein, [P-L]/([P-L]+[P]), plotted against the concentration of polycarpine. The pseudo-*K*D was determined to be 5.29 ± 0.39 μM.
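The kind of fit behind a titration curve such as Figure 7c can be illustrated with a generic sketch. Equations 1 and 2 are not reproduced in this section, so the model below is an assumption rather than the fitting procedure actually used: a standard 1:1 binding isotherm that accounts for ligand depletion, fitted by a simple least-squares search. The function names, the concentration spacing, and the data points are hypothetical; the points are synthetic and noise-free, generated from the model itself, and are not the measured intensities.

```python
import math


def fraction_bound(ligand_total, protein_total, kd):
    """[P-L]/([P-L]+[P]) for 1:1 binding, allowing for ligand depletion
    (quadratic solution of the mass-balance equations). Units: uM."""
    b = protein_total + ligand_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * protein_total * ligand_total)) / 2.0
    return complex_conc / protein_total


def fit_kd(ligand_concs, fractions, protein_total, lo=1e-3, hi=1e3, iters=200):
    """Least-squares estimate of Kd via golden-section search on log10(Kd)."""
    def sse(log_kd):
        kd = 10.0 ** log_kd
        return sum((fraction_bound(l, protein_total, kd) - f) ** 2
                   for l, f in zip(ligand_concs, fractions))

    a, b = math.log10(lo), math.log10(hi)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c) < sse(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)


# Synthetic illustration only: 12 hypothetical ligand concentrations spanning
# the 0.1-300 uM range, with responses generated from the model at Kd = 5.3 uM.
PROTEIN_UM = 9.0
ligand_um = [0.1, 0.3, 1, 3, 5, 10, 20, 30, 50, 100, 200, 300]
synthetic = [fraction_bound(l, PROTEIN_UM, 5.3) for l in ligand_um]

kd_fit = fit_kd(ligand_um, synthetic, PROTEIN_UM)
print(f"recovered Kd = {kd_fit:.2f} uM")
```

Because the protein concentration (9 μM) is comparable to the reported pseudo-KD, a model that ignores ligand depletion would be a poorer approximation here, which is why the quadratic form is used in this sketch. For a covalent binder the fitted value remains an apparent, time-dependent quantity, as noted in the text.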
