**3. Discussion**

In this study, we used a set of 76 sponge samples to present our Semi-automated Prioritization of Extracts for natural Product Research (SeaPEPR) pipeline. Primary bioactivity assessment led to the identification of nine sponge extracts exhibiting bioactivity against at least one of the selected indicator strains. Simultaneously, unsupervised chemical diversity visualization by cosine similarity heat map construction facilitated the overall data interpretation and prioritization of extracts for downstream processes.
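As a minimal sketch of the similarity measure behind such a heat map, the following snippet computes pairwise cosine similarities between extract profiles. It assumes each extract has already been reduced to a vector of feature intensities on a shared feature list; the extract names and intensity values are hypothetical toy data, and the helper names `cosine_similarity` and `similarity_matrix` are illustrative rather than part of SeaPEPR.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length intensity vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention chosen here: an empty profile is dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(profiles):
    """Return sorted extract names and the square matrix of pairwise similarities."""
    names = sorted(profiles)
    matrix = [
        [cosine_similarity(profiles[a], profiles[b]) for b in names]
        for a in names
    ]
    return names, matrix


# Hypothetical toy profiles: intensities of four features per extract.
profiles = {
    "extract_A": [10.0, 0.0, 5.0, 0.0],
    "extract_B": [20.0, 0.0, 10.0, 0.0],  # same composition as A, twice the amount
    "extract_C": [0.0, 7.0, 0.0, 3.0],    # no features shared with A or B
}

names, matrix = similarity_matrix(profiles)
for name, row in zip(names, matrix):
    print(name, [round(value, 2) for value in row])
```

Because cosine similarity depends only on the direction of the vectors, two extracts with the same relative composition but different overall concentration still score as highly similar, which is the property that makes the measure attractive for comparing crude extracts.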

During prioritization, four bioactive extracts (KOL\_08, KOL\_16, KOL\_18, and ULU\_13) were grouped together, indicating highly similar metabolite composition. In fact, orthogonal data obtained from sponge identification by morphological features, such as spicule identification (Figure 6, Table S2), indicated taxonomic uniformity of the organisms (i.e., all specimens were identified as *A. nakamurai* based on spicule morphology). In this case, taxonomic uniformity translated into chemical uniformity. Consequently, agelasines and agelasidines (dereplicated in the active fractions of the representative extract KOL\_18) were found in all members of this metabolic group (Figure 3).
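The grouping step described above can be illustrated with a small sketch that collects extracts into groups of similar chemistry and picks one representative per group. This is an assumption-laden illustration and not the procedure used in this study: the similarity cut-off of 0.8, the greedy grouping rule, the extract names, and the similarity values are all hypothetical.

```python
def group_by_similarity(names, similarity, threshold=0.8):
    """Greedy grouping: an extract joins the first existing group that
    contains a member at least `threshold` similar to it; otherwise it
    starts a new group. `similarity` maps frozenset pairs to a score."""
    groups = []
    for name in names:
        for group in groups:
            if any(similarity[frozenset((name, member))] >= threshold
                   for member in group):
                group.append(name)
                break
        else:
            # No sufficiently similar group was found.
            groups.append([name])
    return groups


# Hypothetical pairwise cosine similarities between three bioactive extracts.
similarity = {
    frozenset(("ext_1", "ext_2")): 0.95,
    frozenset(("ext_1", "ext_3")): 0.10,
    frozenset(("ext_2", "ext_3")): 0.12,
}

groups = group_by_similarity(["ext_1", "ext_2", "ext_3"], similarity)
# One representative per group is carried forward to microfractionation.
representatives = [group[0] for group in groups]
print(groups)
print(representatives)
```

Taking the first member as representative is arbitrary; in practice the choice could be guided by extract amount or potency in the primary screen.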

**Figure 6.** Underwater pictures and isolated spicules of the *Agelas nakamurai* cf specimens: (**a**) Sample KOL\_08, (**b**) Sample KOL\_16, (**c**) Sample KOL\_18, and (**d**) Sample ULU\_13. The specimens are thick, encrusting orange sponges, and the spicules are acanthostyle megascleres in all four samples, which suggested the assignment as *Agelas nakamurai* cf.

On the other hand, PEHE\_5 and PANIKI\_4 were both initially classified as members of the genus *Haliclona*. The metabolite fingerprints of these two sponges, however, clearly indicated that the organisms were distinct. A focused morphological investigation finally revealed that PANIKI\_4 belongs to the genus *Halichondria*. Six other sponges were morphologically identified as *Haliclona* sp. However, only two pairs of high metabolic similarity could be observed in the heat map, indicating different *Haliclona* species. Within this genus, speciation seems to be tightly linked to chemical diversification, as *Haliclona* extracts did not cluster but were distributed throughout the heat map. It is known that, besides the species affiliation of the holobiont, the chemical profile can also be shaped by the associated microbial communities [23], the habitat [24], and stress associated with predation and wounding [25].

Both observations, chemical uniformity within a species (*A. nakamurai*) and interspecies metabolic diversity (*Haliclona* sp.), can be explained by the well-accepted assumption that taxonomic, and thus genetic, diversity is often expressed as chemical diversity. Broad chemical diversity is generally desired in natural product discovery campaigns, and careful selection of the source material is therefore crucial. In this context, the prioritization of extracts based on the similarity of their chemical composition helps to maximize metabolite diversity in downstream processes. Especially for samples for which reliable species identification in the field is challenging (e.g., sponges), chemotyping (e.g., by cosine similarity heat maps) as an interface between primary screenings and follow-up experiments seems useful to decrease workload. Nevertheless, it has to be kept in mind that even different samples of the same species have the potential to yield new and even novel compounds, since analysis of the same species could result in different metabolomes due to dynamic environmental factors [26]. Independent of the sample set, a straightforward downstream pipeline is required to mine the vast amount of data. While other microfractionation platforms are suitable to acquire detailed information about extracts obtained from precisely selected samples such as different medicinal plants [27,28], one benefit of the pipeline presented here is the potential to characterize extracts (and not necessarily the source organism) in detail without processing replicates and yet account for most drivers of metabolic diversity. After prioritization, extract components (ions) are directly linked to the observed bioactivity. Other elegant strategies (e.g., bioactive molecular networking [29]) establish this connection by calculating the Pearson correlation between the relative abundance of ions across chromatographic fractions (usually 18–20) and the observed bioactivity. Our alternative dereplication approach aims to screen fractions containing only a very limited number of ions, ideally a single ion or ions all belonging to the same molecular feature, against the indicator strain (Figures S3, S7–S10). Because fraction collection in assay plates is coupled to MS/MS, a direct experimental connection between candidate molecule and bioactivity can be established. Using this workflow, five of the initial 76 extracts were prioritized based on bioactivity and unique metabolic fingerprint before the causative metabolites were determined by microfractionation.
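For comparison, the correlation-based strategy mentioned above (not the approach taken in this study) can be sketched as follows. The snippet computes the Pearson correlation between the abundance of an ion across fractions and the bioactivity measured in the same fractions. The six-fraction layout, the abundance values, and the inhibition values are hypothetical toy data, and `pearson` is an illustrative helper, not a function from any published tool.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Convention chosen here: a constant series carries no correlation signal.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical growth inhibition (%) measured in six consecutive fractions.
inhibition = [0.0, 0.0, 80.0, 90.0, 10.0, 0.0]

# Hypothetical relative abundances of two ions across the same fractions.
ion_a = [0.0, 1.0, 8.0, 9.0, 1.0, 0.0]  # elutes together with the activity
ion_b = [9.0, 8.0, 0.0, 0.0, 1.0, 0.0]  # elutes before the active fractions

for label, abundance in (("ion_a", ion_a), ("ion_b", ion_b)):
    print(label, round(pearson(abundance, inhibition), 3))
```

An ion whose elution profile tracks the activity profile receives a score close to 1 and becomes a candidate, whereas the microfractionation route described in the text tests near-single-ion fractions directly and therefore does not rely on such a statistical inference.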

Bioactivity of extracts obtained from *A. nakamurai* could be assigned to agelasines and agelasidine A. Synthetic access to the agelasines has already been established [30], and broad compound profiling has been carried out: reported bioactivities include Na,K-ATPase inhibition [9], cyto- and ichthyotoxicity, antiprotozoal [31] and antifouling activity, as well as growth inhibition of *M. tuberculosis*, Gram-positive and Gram-negative pathogenic bacteria [32], and yeast (reviewed by Gordaliza) [33]. Likewise, agelasidines were observed to exhibit activity against *S. aureus* and *C. albicans* [11]. Broad screening of aplysinopsins demonstrated a modulating activity against the glycine-gated chloride channel receptor [13], as well as antineoplastic, antiplasmodial, antibacterial, and antifungal activities. The latter included growth inhibition of *Penicillium atrovenetum* and *Trichophyton mentagrophytes* (reviewed by Bialonska and Zjawiony) [12]. Besides aplysidine A, a mix of several cytotoxic [17,18] sesquiterpene hydroquinones was dereplicated in the extract PEHE\_5 obtained from *Haliclona* sp. The bioactivity of *Neopetrosia* sp. extract ULU\_16 was attributed to stevensine (odiline). Reported activity of stevensine comprises fish deterrence [34] and weak antimicrobial growth inhibition (e.g., *Deleya marina*, a common fouling bacterium) [35]. The compound 20-hydroxyhaterumadienone (here dereplicated from PANIKI\_4, a putative *Halichondria* sp.) is known to possess cytotoxic effects [22,36] and to exhibit weak interaction with human lipoxygenase (5-hLO) [37].

While these results indicate a generally robust transfer of primarily observed growth inhibitory effects to microfractionated assays, two extracts did not show bioactivity in any fraction. These findings emphasize a general challenge in bioactivity-driven NP research (in contrast to cheminformatics-inclined discovery projects [38]): Microbial crude extracts are composed of a mixture of various substances at dramatically different concentrations and potencies. It is important to realize that almost every substance (or a combination of several metabolites) becomes unspecifically toxic at high concentrations, hence producing a positive assay readout. As discrimination between specific and unspecific effects might come at the price of insensitivity, we chose a trade-off in favor of false positive rather than false negative results. Consequently, initially moderately active crude extracts (e.g., ULU\_11 against *C. albicans*) might not produce positive microfractionation readouts. Given suitable chromatography parameters, members of compound families are separated and tested individually at a lower overall concentration. In the case of PEHE\_5, the microfractionated extract was rescreened against *S. tritici* without success. Potentially, the sum of compounds present in the extract (di-brominated aplysinopsins; aureol/chromazonarol) produced an additive, yet unspecific, growth inhibition of the test strain, while the individual compounds did not show the effect. Although reducing unspecific effects caused by high compound concentrations seems to be an advantage, separation and individual testing of metabolites might also prevent the identification of synergistic effects.
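The reasoning above can be made concrete with a deliberately simplified arithmetic sketch. It assumes that inhibition contributions add linearly and that a readout counts as a hit at or above 50% growth inhibition; both the additive model and the threshold are assumptions for illustration, and the compound names and values are hypothetical.

```python
HIT_THRESHOLD = 50.0  # assumed cut-off: percent growth inhibition counted as a hit


def is_hit(inhibition_percent, threshold=HIT_THRESHOLD):
    """Return True if the measured inhibition reaches the hit threshold."""
    return inhibition_percent >= threshold


# Hypothetical inhibition caused by each compound when tested alone in its fraction.
fraction_inhibition = {
    "compound_1": 20.0,
    "compound_2": 25.0,
    "compound_3": 15.0,
}

# Assumed additive model for the crude extract, capped at full inhibition.
crude_inhibition = min(100.0, sum(fraction_inhibition.values()))

crude_is_hit = is_hit(crude_inhibition)
fraction_hits = [name for name, value in fraction_inhibition.items() if is_hit(value)]

print(crude_inhibition, crude_is_hit)  # crude extract passes the threshold
print(fraction_hits)                   # no single fraction does
```

Under these assumptions the crude extract is scored as active while every individual fraction stays below the threshold, which mirrors the situation described for PEHE\_5. Real dose-response behavior is not linear, so the sketch only illustrates the direction of the effect.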

Another limitation of rapid MS/MS-based annotation approaches, including the methodology presented here, is the reduced identification confidence for target molecules (as defined by the Metabolite Annotation Task Group of the Metabolomics Society) [39,40] compared to studies that fully assign structure and stereochemistry. Accordingly, no distinction could be made between the isomeric sponge metabolites aureol and chromazonarol, or between the putative 6′-aureoxyaureol and 6′-aureoxychromazonarol. Despite these challenges, SeaPEPR has proven its value as a prioritization strategy that allows data-based decision making on follow-up projects early in the discovery process. This study gave insight into the metabolites of four morphologically seemingly different specimens of *A. nakamurai*, preventing an otherwise very daunting task of molecular structure elucidation.

If a compound exhibits desired properties such as structural novelty or repurposing potential, or if more material is needed for in-depth investigation of observed bioactivities, the metabolite should undergo further analysis, including confirmation of the three-dimensional structure and extensive activity profiling. For repurposing studies of small molecules, the amount (~1 mg) needed for hit characterization experiments might be generated by straightforward chemical synthesis, as shown for the agelasines [30]. While an unknown and likewise bioactive metabolite is scientifically most intriguing, it initially requires more sample material; hence, detailed metadata should be recorded in the field (Table S2) to allow resupply. Collecting specimens with the same chemotype might be challenging, but is not necessarily so, as shown by the robust metabolic fingerprint of *A. nakamurai* across sampling sites (>60 km distance between the Kolongan and Ulu sampling sites). Before isolation from animal tissue is conducted, metabolite access via fermentation of the cultivable microbiome should be investigated. If this route is obstructed, authorities should decide case by case whether a targeted isolation campaign from animal tissue, aimed at new and urgently needed antibiotic or agrochemical lead structures, is ethically justifiable. Selection of promising projects might be facilitated by data obtained from prioritization processes such as SeaPEPR.

Finally, and importantly, to the best of our knowledge, no bioactivity against the common plant pest *S. tritici* has been reported for any of the sponge compounds dereplicated here. The ascomycete *S. tritici*, which is the causative agent of blotch disease on wheat, is responsible for serious losses in cereal yields and quality in Western European countries. In 2014, an estimated \$1.3 billion worth of fungicides was used to control *Septoria*-induced crop rust [41]. Resistance development, strict EU regulations, and increased public awareness regarding the use of petrochemicals drive the continuous demand for new agents with potency against *S. tritici*. The data presented here indicate that marine-derived natural products offer potential solutions for current challenges in plant pest control.
