#### *3.1. Identification of the Albucidin Biosynthetic Gene Cluster*

The aim of this study was to identify the biosynthetic genes leading to the production of the nucleoside phytotoxin albucidin. For this purpose, the genome sequence of the producer strain of *Streptomyces albus* subsp. *chlorinus* NRRL B-24108 was analysed by genome-mining software [17]. This analysis led to the identification of several putative nucleoside clusters. To prove the involvement of these candidate clusters in albucidin production, they were heterologously expressed in a genetically engineered cluster-free strain *Streptomyces albus* Del14 [19] and in *Streptomyces lividans* TK24 [20]. No albucidin production was detected in the extracts of the obtained strains, indicating that either the expressed clusters were not involved in the biosynthesis of albucidin or they were not expressed in the heterologous host environment. The inactivation of the candidate clusters in the natural albucidin producer was not feasible because the strain is refractory to genetic manipulation. Considering the difficulties in identifying the albucidin gene cluster using conventional methods, an alternative approach using chemical mutagenesis was chosen.

For chemical mutagenesis of the albucidin-producing strain *S. albus* subsp. *chlorinus* NRRL B-24108, 1-methyl-3-nitro-1-nitrosoguanidine (NTG) was used. The strain in the exponential growth stage was treated with various NTG concentrations (800 μg/mL, 600 μg/mL and 300 μg/mL) for 30 min. After mutagenesis, the cells were washed with 5% thiosulfate solution and plated in dilutions on MS-agar medium for segregation of mutations. The spores of the obtained mutant populations were washed and plated on MS-agar plates in dilutions to obtain single colonies. Altogether, 4000 individual mutants were analysed for albucidin production. The mutants were cultivated on individual plates with the production medium SG agar. The metabolites were extracted with butanol, and albucidin production was assayed by HPLC-MS. Eight mutants that lost the ability to produce albucidin were identified in the course of this screening: 6-238, 6-260, 6-389, 6-444, 6-612, 6-892, 8-610 and 8-639. The genomic DNA of the obtained zero mutants was sequenced using Illumina technology. The point mutations in the genomes of the mutants were detected by mapping the sequencing reads to the reference genome of the wild-type albucidin producer. Up to 100 transition mutations were identified in the genomes of the mutant strains. By comparing the mutation patterns of the separate mutants, a short genomic region was identified that was affected by point mutations in all analysed zero mutants, implying its potential involvement in albucidin production (Figure S2). The identified region contains two genes, *SACHL2\_05525* and *SACHL2\_05524*, which encode putative radical SAM proteins and were named *albA* and *albB* (Table 1, Figure 1b). The genes constitute a putative operon with a third gene, *SACHL2\_05523*, which was named *albC*. The *albC* gene encodes a putative ribonucleoside-triphosphate reductase and was not affected by point mutations in the analysed zero mutants of *S. albus* subsp. *chlorinus* NRRL B-24108. The identified genes *albA* and *albB* were not part of the nucleoside gene clusters previously identified by genome mining and analysed in this study. Interestingly, these genes were located within a DNA fragment annotated by the genome-mining software as a putative NRPS gene cluster.
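
The comparison of mutation patterns described above can be illustrated with a minimal sketch. It assumes that point mutations are available as genome coordinates per mutant and that the genome is divided into fixed-size bins. The function name `shared_windows`, the bin size and the toy coordinates are hypothetical and do not reproduce the data or the software used in this study.

```python
from typing import Dict, List, Tuple


def shared_windows(mutations: Dict[str, List[int]], window: int) -> List[Tuple[int, int]]:
    """Return the bins [start, end) that carry at least one mutation in every mutant."""
    hits = None
    for positions in mutations.values():
        # Bins of the genome that are hit in this particular mutant.
        bins = {pos // window for pos in positions}
        # Keep only the bins that were also hit in all mutants seen so far.
        hits = bins if hits is None else hits & bins
    return [(b * window, (b + 1) * window) for b in sorted(hits or [])]


# Hypothetical coordinates, for illustration only.
toy = {
    "mutant_A": [1_200, 48_310, 90_500],
    "mutant_B": [22_000, 48_900, 70_100],
    "mutant_C": [48_050, 60_000],
}
result = shared_windows(toy, window=2_000)
print(result)  # [(48000, 50000)]
```

Fixed bins keep the sketch short, but a shared region that straddles a bin boundary would be missed; a sliding window or a gene-level comparison would avoid this.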


**Table 1.** Genes encoded within the chromosomal fragment cloned in BAC 1K1.

The locus tags refer to the genome sequence of *S. albus* subsp. *chlorinus* NRRL B-24108 available under GenBank accession number VJOK00000000.

**Figure 1.** Chromosomal fragment of *S. albus* subsp. *chlorinus* NRRL B-24108 with the albucidin biosynthetic genes. (**a**) Schematic representations of DNA fragments cloned in BACs 1K1 and 2D4; (**b**) The genes encoded within the fragment cloned in BAC 1K1. The *albA–C* operon is marked in red; and (**c**) Overview of the performed deletions within BAC 1K1.

To determine whether the identified genes *albA* and *albB* encode albucidin biosynthetic enzymes, BAC 1K1 containing these genes (Figure 1a) was selected from the genomic library of *S. albus* subsp. *chlorinus* NRRL B-24108. BAC 1K1 was transferred into the heterologous host strains *S. albus* Del14 and *S. lividans* TK24 by conjugation, and the production profiles of the obtained strains *S. albus* 1K1 and *S. lividans* 1K1 were analysed by HPLC-MS. A compound with a high-resolution mass corresponding to albucidin was detected in the extracts of *S. albus* 1K1 (Figure S3). No production was detected in *S. lividans* 1K1.

Due to the lack of an albucidin standard, we set out to purify the compound identified in the extracts of *S. albus* 1K1 for structure elucidation by NMR spectroscopy. The *S. albus* 1K1 strain was cultivated in 10 L of SG medium for 7 days. The culture supernatant was extracted with an equal volume of butanol, and the obtained extract was concentrated under vacuum. Four milligrams of the compound were purified using size-exclusion and reversed-phase chromatography and used for subsequent NMR studies. Analysis of the recorded NMR spectra of the purified compound unequivocally demonstrated its identity as albucidin (Figure S1 and Figure 2a).

**Figure 2.** The structures of (**a**) albucidin and (**b**) oxetanocin A.

The production of albucidin by *S. albus* 1K1 provides evidence that the genes *albA* and *albB* identified by chemical mutagenesis encode albucidin biosynthetic enzymes. The lack of albucidin production by *S. lividans* 1K1 may be explained by differences in the regulatory networks of the *S. albus* Del14 and *S. lividans* TK24 strains.

#### *3.2. Identification of the Minimal Set of Albucidin Biosynthetic Genes*

BAC 1K1, which leads to the production of albucidin upon expression in the heterologous host *S. albus* Del14, contains a 32 kb chromosomal fragment from the natural albucidin producer *S. albus* subsp. *chlorinus* NRRL B-24108. Twenty-nine open reading frames were annotated in this 32 kb region (Table 1, Figure 1b). Of these genes, only two, *albA* and *albB*, were affected by point mutations in the albucidin zero mutants identified in the course of the chemical mutagenesis studies. These two genes constitute a putative operon with the gene *albC*, implying that either only the genes *albA* and *albB* are necessary for albucidin production or that all three genes within the operon are required. To experimentally determine the minimal set of albucidin biosynthetic genes, a series of gene deletions was performed within the cloned region of BAC 1K1.

The genes *albA–C* are located in the middle part of the chromosomal fragment cloned in BAC 1K1. The *alb* operon is preceded by the genes *SACHL2\_05539–SACHL2\_05526* and followed by the genes *SACHL2\_05522–SACHL2\_05511* (Table 1). For the sake of simplicity, the 29 genes *SACHL2\_05539–SACHL2\_05511* cloned in BAC 1K1 will be designated in the text according to their sequence number (1 to 29; Figure 1b). To determine which genes within the 1K1 cloned fragment are essential for albucidin production, five deletions (LS, KO14, KO15, KO16 and RS) were performed in BAC 1K1, yielding the BACs 1K1\_LS, 1K1\_KO14, 1K1\_KO15, 1K1\_KO16 and 1K1\_RS (Figure 1c). In BAC 1K1\_LS, the left shoulder encompassing genes 1–13 was substituted by an ampicillin resistance marker (Figure 1c). Gene 14, *albA* (gene 15) and *albB* (gene 16) were substituted with the ampicillin resistance gene in BACs 1K1\_KO14, 1K1\_KO15 and 1K1\_KO16, respectively (Figure 1c). In BAC 1K1\_RS, the right shoulder encompassing genes 17 (*albC*) to 28 was substituted by the hygromycin resistance marker (Figure 1c). The constructed BACs were transferred separately into the *S. albus* Del14 strain by conjugation, and the albucidin production of the resulting strains was assayed by HPLC-MS.
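
The gene numbering used in the text can be reproduced with a short sketch. It assumes that the locus tags between *SACHL2\_05539* and *SACHL2\_05511* are consecutive with no gaps, which is consistent with the 29 genes stated above but should be checked against Table 1. The helper `gene_number` is hypothetical.

```python
FIRST, LAST = 5539, 5511  # numeric parts of SACHL2_05539 (gene 1) and SACHL2_05511 (gene 29)


def gene_number(locus_tag: str) -> int:
    """Map a locus tag to its sequence number (1 to 29) within the 1K1 fragment."""
    n = int(locus_tag.split("_")[1])
    if not LAST <= n <= FIRST:
        raise ValueError(f"{locus_tag} lies outside the fragment cloned in BAC 1K1")
    # Locus tag numbers decrease along the fragment, so gene 1 has the highest number.
    return FIRST - n + 1


alb = {
    name: gene_number(tag)
    for name, tag in {
        "albA": "SACHL2_05525",
        "albB": "SACHL2_05524",
        "albC": "SACHL2_05523",
    }.items()
}
print(alb)  # {'albA': 15, 'albB': 16, 'albC': 17}
```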

The deletion of gene 14 did not affect albucidin production in *S. albus* 1K1\_KO14 (Figure S4B). As expected from the results of chemical mutagenesis, inactivation of the genes *albA* (gene 15) and *albB* (gene 16) completely abolished albucidin production in the strains *S. albus* 1K1\_KO15 and *S. albus* 1K1\_KO16 (Figure S4C,D). This unambiguously demonstrates the essential role of the genes *albA* and *albB* in albucidin biosynthesis.

Deletion of genes 17 (*albC*) to 28 did not affect albucidin production by the *S. albus* 1K1\_RS strain (Figure S4F). It was expected from the gene annotation and the results of chemical mutagenesis that genes 18–28 do not participate in albucidin biosynthesis. However, the dispensability of the gene *albC* (gene 17) is surprising since it belongs to the same operon as the essential genes *albA* (gene 15) and *albB* (gene 16). The deletion of the *albC* gene (gene 17) might be cross-complemented by an unidentified gene in the genome of the host strain *S. albus* Del14.

Albucidin production was abolished in the strain *S. albus* 1K1\_LS (Figure S4E), implying that at least one of the genes 1–13 that were deleted in BAC 1K1\_LS might be essential for albucidin biosynthesis. No genes encoding regulatory proteins or structural enzymes that might participate in nucleoside biosynthesis were identified in close proximity to the *albA–C* operon. To identify the genes within the deleted LS region that influence albucidin production, BAC 2D4 was isolated from the genomic library of *S. albus* subsp. *chlorinus* NRRL B-24108. The chromosomal fragment cloned in BAC 2D4 overlaps with the fragment cloned in BAC 1K1 and covers the *albA–C* operon (Figure 1a). In contrast to BAC 1K1, BAC 2D4 lacks genes 1–6, which are part of the deleted LS region. BAC 2D4 was transferred into *S. albus* Del14. Albucidin production could be detected in the extracts of the obtained strain *S. albus* 2D4 by HPLC-MS (Figure S5). This indicates that genes 1–6 within the LS region are not involved in albucidin production and that one of the genes 7–13 is responsible for the loss of albucidin production in *S. albus* 1K1\_LS.

To identify which of the genes 7–13 is involved in albucidin biosynthesis, each of them was individually substituted by an ampicillin resistance marker in BAC 1K1, yielding 1K1\_KO7, 1K1\_KO8, 1K1\_KO9, 1K1\_KO10, 1K1\_KO11, 1K1\_KO12 and 1K1\_KO13 (Figure 1b,c). The constructed BACs were transferred into the *S. albus* Del14 strain, and albucidin production was analysed. The albucidin production levels of all obtained strains, except *S. albus* 1K1\_KO12, were in the range of *S. albus* 1K1 harbouring the unmodified BAC (Figure S6). Albucidin production was abolished in *S. albus* 1K1\_KO12 (Figure S6G), indicating that gene 12 is responsible for the detrimental effect of the LS deletion on albucidin biosynthesis. No enzymatic activity could be assigned to the peptide product of gene 12 using BLAST analysis. The product also did not show homology to any known regulatory protein. Considering this, it was proposed that only the genes *albA* and *albB* encode structural enzymes essential for albucidin production in the heterologous host *S. albus* Del14 and that the product of gene 12 exerts a regulatory effect on transcription of the *albA–C* operon through a mechanism that is not yet understood. To test this, BAC 1K1\_alb\_act was constructed containing only the *albA–C* genes under the control of a strong promoter. The genes downstream of the *albA–C* operon (genes 18–28) were substituted with the ampicillin resistance gene, and the genes upstream of the operon (genes 1–14) were substituted with the hygromycin resistance gene (Figure 1c). The hygromycin resistance gene used was under the control of the strong synthetic promoter TS81 and did not contain a terminator at its 3'-end [21]. The hygromycin marker was inserted upstream of the *albA–C* genes in the orientation that enabled read-through from the TS81 promoter and thus their transcriptional activation. The constructed BAC 1K1\_alb\_act was transferred into the heterologous host strain *S. albus* Del14. The production of albucidin was detected in the extracts of the obtained strain *S. albus* 1K1\_alb\_act by HPLC-MS (Figure S7). A threefold increase in albucidin production was observed in the strain *S. albus* 1K1\_alb\_act compared to *S. albus* 1K1 containing the unmodified albucidin cluster (Figure S7). Taking into account that the total recovered albucidin yield from the *S. albus* 1K1 strain was approximately 0.4 mg/L, the calculated albucidin production by *S. albus* 1K1\_alb\_act corresponded to approximately 1.2 mg/L. An albucidin production level of 2 mg/L was reported for the original producer *Streptomyces albus* subsp. *chlorinus* NRRL B-24108 [10].
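
The yield estimate can be written out explicitly. This is a rough calculation that assumes the threefold increase observed by HPLC-MS translates linearly into recovered yield; the ratio to the titre reported for the original producer is a derived value, not a measurement.

$$
0.4\ \mathrm{mg/L} \times 3 \approx 1.2\ \mathrm{mg/L}, \qquad \frac{1.2\ \mathrm{mg/L}}{2\ \mathrm{mg/L}} = 0.6
$$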

Albucidin production by the strain *S. albus* 1K1\_alb\_act, together with the dispensability of *albC* in *S. albus* 1K1\_RS, demonstrates that the genes *albA* and *albB* constitute the minimal set of genes required for albucidin biosynthesis in the heterologous host *S. albus* Del14. The role of the gene *albC* in albucidin biosynthesis is not completely understood. Because *albC* constitutes a single operon with *albA* and *albB* and its product shows homology to nucleotide biosynthetic enzymes, it cannot be completely excluded that *albC* is involved in albucidin production in the natural producer. However, the deletion of *albC* has no effect on albucidin production in the heterologous host.
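
The reasoning that leads from the deletion series to the minimal gene set can be summarised in a short sketch. The phenotypes are taken from the text. The encoding as (name, missing genes, producer) tuples and the function `candidates` are hypothetical, and BAC 2D4 is treated simply as a construct lacking genes 1–6.

```python
def span(first: int, last: int) -> set:
    """Inclusive range of gene sequence numbers."""
    return set(range(first, last + 1))


# (construct, genes missing relative to BAC 1K1, albucidin detected in S. albus Del14)
constructs = [
    ("1K1_LS", span(1, 13), False),
    ("1K1_KO14", {14}, True),
    ("1K1_KO15", {15}, False),
    ("1K1_KO16", {16}, False),
    ("1K1_RS", span(17, 28), True),
    ("2D4", span(1, 6), True),
] + [(f"1K1_KO{g}", {g}, g != 12) for g in range(7, 14)]


def candidates(records) -> set:
    """Genes missing from a non-producer but never missing from a producer."""
    lost = set().union(*(genes for _, genes, ok in records if not ok))
    dispensable = set().union(*(genes for _, genes, ok in records if ok))
    return lost - dispensable


# Under native regulation, gene 12 still appears to be required.
native = candidates(constructs)
print(sorted(native))  # [12, 15, 16]

# With albA-C driven by the TS81 promoter, genes 1-14 and 18-28 are dispensable.
activated = candidates(constructs + [("1K1_alb_act", span(1, 14) | span(18, 28), True)])
print(sorted(activated))  # [15, 16]
```

The two results mirror the argument in the text: gene 12 is needed only when the operon depends on its native regulation, whereas genes 15 (*albA*) and 16 (*albB*) remain required in every context tested.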

The identification of the minimal set of albucidin biosynthetic genes allows its expression in various heterologous chassis strains as well as the rational construction of albucidin overproducers. The engineering of the albucidin biosynthetic genes can be performed in *E. coli*, and the obtained constructs can be heterologously expressed in *Streptomyces* hosts. In contrast to the genetically intractable original albucidin producer *Streptomyces albus* subsp. *chlorinus* NRRL B-24108, commonly used heterologous strains possess a well-established toolkit for their genetic manipulation. This opens the possibility to engineer their metabolic network to increase the intracellular levels of biosynthetic precursors and thereby the production yields. Heterologous strains are often characterized by a simplified metabolic background, which provides better detection limits for heterologously expressed compounds than the original producers, higher product yields and simplified downstream processing. Construction of albucidin overproducers based on heterologous expression hosts is not necessarily limited to a rational approach. The chassis strains expressing the heterologous cluster may also be subjected to classical mutagenesis and screening for overproducing clones.
