**3. Results**

Table 1 summarizes the genome sequences determined in this work: the N2 and N3 syntrophic mixtures, the pure cultures DSM 1675 and DSM 1676 (isolated from the 2-K and N2 mixtures, respectively), and the pure culture DSM 1685 (from the 2-K mixture). Knowing that *Cp. ethylica* N2 and N3 each consist of a mixture of two species, we attempted to separate the raw genome data by metagenomic binning. For *Cp. ethylica* N3, we obtained two useful bins: one genome related to *Prosthecochloris* sp. (97.0% coarse consistency and 96.7% fine consistency) and one related to *Desulfuromonas acetoxidans* (98.4% coarse consistency and 97.3% fine consistency), both with 100% completeness (included in Table 1). From the *Cp. ethylica* N2 data, we obtained only the genome of the *Prosthecochloris* sp.

**Table 1.** Genome characteristics of the *Prosthecochloris* and *Desulfuromonas* genomes used in this study.


#### *3.1. Prosthecochloris Genome from the Green Sulfur Component*

Average nucleotide identity (ANIb) comparison showed that the genome sequences of the green component of the N2 and N3 mixtures were virtually the same as that of the strain isolated from the 2-K mixture (DSM 1685). All three were similar to those of *Prosthecochloris* species, with an average nucleotide identity (ANI) of 75% to the nearest previously sequenced strain, HL130 (Table 2). The ANI values to all the other *Prosthecochloris* species are well below 95%, which is the arbitrary cutoff value for species differentiation [24]. It is, therefore, likely that the genome sequence of the green component of the N2, N3, and 2-K mixtures is from a new species, for which we propose the name *Prosthecochloris ethylica* comb. nov.

**Table 2.** Percentage Average Nucleotide Identity (ANIb) of *Prosthecochloris* green bacteria.


Bold: ANI values above 95%, which is the arbitrary cutoff value for species differentiation.
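The species-level reading of an ANI table can be sketched in a few lines. The example below is a minimal illustration, not the procedure used in this work: it assumes a single-linkage grouping at the 95% cutoff, treats each pair as one symmetric value (ANIb is in practice computed in both directions), and uses hypothetical strain labels with placeholder ANI values that are not taken from Table 2.

```python
from itertools import combinations

SPECIES_CUTOFF = 95.0  # percent ANI; the arbitrary species cutoff cited in the text


def group_by_ani(strains, ani, cutoff=SPECIES_CUTOFF):
    """Group strains by single linkage: two strains share a group when
    their pairwise ANI is at or above the cutoff (assumed rule)."""
    parent = {s: s for s in strains}

    def find(s):
        # Follow parent links to the group representative, compressing the path.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(strains, 2):
        value = ani.get((a, b), ani.get((b, a)))
        if value is not None and value >= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for s in strains:
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())


# Hypothetical labels and placeholder values, for illustration only.
strains = ["strain_A", "strain_B", "strain_C", "strain_D"]
ani = {
    ("strain_A", "strain_B"): 99.9,
    ("strain_A", "strain_C"): 99.8,
    ("strain_B", "strain_C"): 99.9,
    ("strain_A", "strain_D"): 75.0,
    ("strain_B", "strain_D"): 75.1,
    ("strain_C", "strain_D"): 75.0,
}

species_groups = group_by_ani(strains, ani)
print(species_groups)
```

With these placeholder values, the three near-identical strains fall into one group and the fourth strain stands alone, which mirrors the pattern described for the green component and its nearest relative.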

As expected, these genomes contained a gene for cytochrome C-555, for which the translated protein sequences of strains N2 and N3 were identical (100% identity) to that previously reported for the green component of the *Cp. ethylica* strain 2-K mixture [5], as shown in Figure 1.

A whole-genome phylogenetic tree for *Prosthecochloris* placed the green component of the N2 and N3 mixtures as nearly identical to strain DSM 1685, isolated from the 2-K mixture (red clade in Figure 2). This is consistent with the ANI comparisons mentioned above and clearly distinguishes this clade as separate from the other sequenced *Prosthecochloris* strains, with strain HL130 as the closest relative. This further supports the proposal of a new *Prosthecochloris* species.

> **(A)** Alignment block 1:
>
> 1) AVTKADVEQYDLANGKTVYDANCASCHAAGIMGAPKTGTARKWNSRLPQ
> 2) MKRTVSALTLSAIFALSFGLDAQAAVTKADVEQYDLANGKTVYDANCASCHAAGIMGAPKTGTARKWNSRLPQ
> 3) MKRTVSALTLSAIFALSFGLDAQAAVTKADVEQYDLANGKTVYDANCASCHAAGIMGAPKTGTARKWNSRLPQ
> 4) MKRTVSALTLSAIFALSFGLDAQAAVTKADVEQYDLANGKTVYDANCASCHAAGIMGAPKTGTARKWNSRLPQ
>
> Alignment block 2:
>
> 1) GLATMIEKSVAGYEGEYRGSKTFMPAKGGNPDLTDKQVGDAVAYMVNEVL
> 2) GLATMIEKSVAGYEGEYRGSKTFMPAKGGNPDLTDKQVGDAVAYMVNEVL
> 3) GLATMIEKSVAGYEGEYRGSKTFMPAKGGNPDLTDKQVGDAVAYMVNEVL
> 4) GLATMIEKSVAGYEGEYRGSKTFMPAKGGNPDLTDKQVGDAVAYMVNEVL
>
> **(B)** Alignment block 1:
>
> 1) ADVVTYENKKGNVTFDHKAHAEKLGCDACHEGTPAKIAIDKKSAHKDACKTCH
> 2) MKKLIVAIMLVAFAATAAFAADVVTYENKKGNVTFDHKAHAEKLGCDACHEGTPAKIAIDKKSAHKDACKTCH
> 3) MKKLIVAIMLVAFAATAAFAADVVTYENKKGNVTFDHKAHAEKLGCDACHEGTPAKIAIDKKSAHKDACKTCH
> 4) MKKLIVAIMLVAFAATAAFAADVVTYENKKGNVTFDHKAHAEKLGCDACHEGTPAKIAIDKKSAHKDACKTCH
>
> Alignment block 2:
>
> 1) KSNNGPTKCGGCHIK
> 2) KSNNGPTKCGGCHIK
> 3) KSNNGPTKCGGCHIK
> 4) KSNNGPTKCGGCHIK

**Figure 1.** (**A**) Sequence alignment of cytochromes c-555 from *Prosthecochloris*. (1) Van Beeumen et al. [5] soluble protein from 2-K mix, (2) translated genome from pure 2-K DSM 1685, (3) translated genome from N2 mix, and (4) translated genome from N3 mix. (**B**) Sequence alignment of cytochromes c-551.5 from *Desulfuromonas*. (1) Ambler [6] soluble protein from 2-K mix, (2) translated genome from pure 2-K DSM 1675, (3) translated genome from pure N2 DSM 1676, and (4) translated genome from N3 mix.
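The identity between the sequenced soluble protein and the translated genes in Figure 1A can be checked with a short script. This is a minimal sketch with two assumptions: the two alignment blocks of each row are consecutive segments of one sequence, and the extra N-terminal residues in the translated gene represent a cleaved leader peptide (an interpretation, not a statement from the figure). The helper name `n_terminal_extension` is hypothetical.

```python
def n_terminal_extension(mature, translated):
    """Return the residues that precede `mature` when it matches the
    C-terminal end of `translated` exactly; return None otherwise."""
    if not translated.endswith(mature):
        return None
    return translated[: len(translated) - len(mature)]


# Figure 1A, row 1: soluble protein from the 2-K mixture (both blocks joined).
MATURE_C555 = (
    "AVTKADVEQYDLANGKTVYDANCASCHAAGIMGAPKTGTARKWNSRLPQ"
    "GLATMIEKSVAGYEGEYRGSKTFMPAKGGNPDLTDKQVGDAVAYMVNEVL"
)

# Figure 1A, rows 2 to 4: translated genes (identical in DSM 1685, N2, and N3).
TRANSLATED_C555 = (
    "MKRTVSALTLSAIFALSFGLDAQAAVTKADVEQYDLANGKTVYDANCASCHAAGIMGAPKTGTARKWNSRLPQ"
    "GLATMIEKSVAGYEGEYRGSKTFMPAKGGNPDLTDKQVGDAVAYMVNEVL"
)

extension = n_terminal_extension(MATURE_C555, TRANSLATED_C555)
if extension is None:
    print("No exact C-terminal match")
else:
    print(f"Exact match; N-terminal extension of {len(extension)} residues: {extension}")
```

An exact C-terminal match means that every residue of the sequenced protein is reproduced in the translated gene, which is the sense in which the text reports 100% identity.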

**Figure 2.** Whole-genome-based phylogenetic tree of all sequenced *Prosthecochloris* species. The phylogenetic tree was generated using the CodonTree method within PATRIC [22], which used PGFams as homology groups. The support values for the phylogenetic tree were generated using 100 rounds of the "Rapid bootstrapping" option of RAxML [25]. *Chlorobaculum* sp. 24CR was used as an outgroup [29]. The branch length tree scale is defined as the mean number of substitutions per site, which is an average across both nucleotide and amino acid changes. Species marked in red belong to the newly proposed *Prosthecochloris ethylica* comb. nov.

A search for protein families that are unique amongst the *Prosthecochloris* strains using PATRIC revealed a Tad (Tight Adhesion) pili gene cluster that is found in strains N2, N3, and DSM 1685 and, with lower homology, in strains GSB1 and CIB2401, but appears to be absent from all other *Prosthecochloris* strains. This Tad pili gene cluster consists of at least 10 genes related to Type II/IV Flp pili assembly and secretion. The synteny of the gene cluster is preserved in all these strains (Figure 3), indicating an evolutionary conservation of this cluster. The closest relatives with a similar gene cluster were found to be *Pelodictyon phaeoclathratiforme* BU-1 and *Chlorobium luteolum* DSM 273. Interestingly, *P. phaeoclathratiforme* is a brown-colored member of the *Chlorobiaceae* that was described to form net-like colonies [30].

**Figure 3.** Gene cluster organization and synteny comparison of the Tad (Tight Adhesion) genes identified in the *Prosthecochloris* genomes. Genes are colored according to protein family (PGFam).

Tad pili gene clusters encode a macromolecular transport system (type II secretion system). They are present in the genomes of a wide variety of both Gram-negative and Gram-positive bacteria and are involved in close adhesion of cells during biofilm formation, colonization, and, sometimes, pathogenesis [31–34]. The long filamentous fibrils are formed by bundles of individual pilus strands, consisting of the fimbrial low-molecular-weight protein Flp [35–37]. Their attachment to surfaces and other cells is expected to create an optimized environment for nutrient, metal, and electron exchange between cells [31,33]. Given the presence of these conserved genes and the synteny in the *Prosthecochloris* strains isolated from the three *Cp. ethylica* mixtures, these Tad-encoded pili likely play a similar role in the syntrophic relationship of these strains by forming cell–cell interactions.

It has been described earlier that in larger multicellular phototrophic consortia of *Chlorochromatium aggregatum*, a few large genes encode proteins that are anticipated to play an important role in the formation of close interspecies interactions [14]. This was suggested based on in silico analysis, but the exact physical role of these proteins in forming these interactions has not been described. The four putative ORFs involved showed similarities to hemagglutinins (2 ORFs), an RTX toxin, and a hemolysin, and were found to be some of the largest genes detected in prokaryotes and to contain multiple characteristic repeats [13,14]. When searching for large ORFs in the *Prosthecochloris* genomes that we sequenced, we found the largest ORF (coding for 2417 aa) to be annotated as "hypothetical protein." An NCBI BLASTP search using this sequence showed it to be homologous to an "outer membrane adhesin-like protein" from *Pelodictyon phaeoclathratiforme* and a "tandem-95 repeat" protein from *Prosthecochloris aestuarii*, albeit with low protein identity (<39%). Further comparison showed that all three of these proteins are homologous to hemagglutinin/adhesin-like proteins, similar to outer membrane adhesin proteins of the RTX toxin family, which contain numerous, internally repeated, calcium-binding domains [38].

Although highly diverse in terms of structure and/or adhesive properties, outer membrane adhesins in Gram-negative bacteria are usually grouped into two main categories: adhesins secreted through a type 1 secretion system (T1SS) and adhesins secreted through one of the type 5 secretion systems (T5SS) [39]. The most studied of these secreted adhesins are the biofilm-associated family of proteins (Bap), which are high-molecular-weight multidomain proteins containing an N-terminal secretion signal, a core domain of highly repeated motifs, and a glycine-rich C-terminal domain. Bap family members have been shown to be involved in cell adhesion to abiotic surfaces and biofilm formation in both Gram-positive and Gram-negative bacteria (for reviews, see [40,41]). However, only a few of these proteins have been characterized experimentally. A closer look at the gene region surrounding the large *Prosthecochloris* putative adhesion protein revealed two features. First, the gene is located at the end of a contig in the genomes of strains N3, N2, and DSM 1685, indicating that the full size of the encoded protein might be larger than 2417 amino acids. Second, the gene is followed by genes encoding an agglutination protein (TolC family type I secretion outer membrane protein), an ABC transmembrane transporter (type I secretion system ATPase), and a HlyD homologous protein (type I secretion membrane fusion protein) (Figure 4). Comparison of this gene region to closely related genomes showed that *Pr. aestuarii* DSM 271 contains a complete sequence for the adhesin-like gene (encoding 4748 amino acids), which is preceded by a gene for a large protein (4983 amino acids) that is homologous to the structural toxin protein RtxA (a T1SS-143 repeat domain-containing protein) in a similar gene cluster. The presence of the large RtxA homologue and the adhesin-like protein in a gene cluster with other T1SS-related proteins indicates that these genes indeed encode adhesins secreted through a type 1 secretion system.

The occurrence of the adhesin-like gene at the end of a contig in our *Prosthecochloris* genomes is likely due to the fact that these genes contain multiple sequence repeats, which are known to be a potential challenge for next-generation sequencing-based genome assembly programs [42]. When searching the N3 genome with the adhesin and RtxA toxin-like proteins from *Pr. aestuarii* DSM 271 (using BLASTP in PATRIC), we found partial genes (annotated as hypothetical proteins) located at the ends of two other contigs (contigs 008 and 029). These partial gene-containing contigs from strain N3 align very well with the gene cluster identified in strain DSM 271 (see Figure 4), which supports the hypothesis that the adhesin gene is incomplete because of assembly software limitations. The same was true for the 2-K (DSM 1685) and N2 genomes, where the RtxA homologue was also found on separate contigs. We can conclude that the *Prosthecochloris* genomes from the 2-K, N2, and N3 mixtures all contain large genes similar to the putative ORFs (hemagglutinin and RTX toxin) that were described in *C. aggregatum* to be important for the formation of close interspecies interactions [13,14].

**Figure 4.** Gene cluster organization and synteny comparison of the genes encoding the large adhesin-like protein, identified in the *Prosthecochloris* N3 genome, and their homologues. The genes for the large extracellular outer membrane adhesion protein and RtxA discussed in the text are marked in bold.

#### *3.2. Desulfuromonas Genome from the Colorless Sulfur-Reducing Component*

Based on ANIb comparisons (Table 3) and whole-genome phylogenetic analysis (Figure 5) of the colorless component in the N3 mixture, we can conclude that it is very similar to *Ds. acetoxidans* DSM 1675 and DSM 1676, previously isolated from the mixtures 2-K and N2, and that all three are nearly the same as the type strain DSM 684.

The *Cp. ethylica* 2-K cytochrome C-551.5 protein [6] was identical to the translated gene product from the *Desulfuromonas* component of the N3 mixture and from the DSM 1675 and DSM 1676 pure cultures, as shown in Figure 1B, but the gene was apparently absent from the type strain DSM 684. The C-551.5 gene synteny is conserved in the N3 mixture, DSM 1675, and DSM 1676 genomes (Figure 6), and the gene is surrounded in all of the strains by a gene for cytochrome C (PGFam\_04122568; located downstream) and a Mg/Co/Ni transporter MgtE (PGFam\_04560429; upstream and antisense). These surrounding genes are both present in strain DSM 684; however, they are each located near the end of separate contigs (Figure 6). It is, therefore, possible that the C-551.5 gene was lost during assembly of the DSM 684 genome. Further studies will be needed to confirm the presence of C-551.5 in the type strain.


Bold: ANI values above 95%, which is the arbitrary cutoff value for species differentiation.

The genomes of *Desulfuromonas* strains DSM 1675, DSM 1676, DSM 684T, and the genome isolated from the N3 mixture contain many genes from unique protein families (PGFams identified in PATRIC) that are apparently not found in the other sequenced *Desulfuromonas* strains. At least 18 unique PGFams were related to the synthesis of Type IV pili of two different classes. Eight of them encode mannose-sensitive hemagglutinin (MSHA) type pili, and the other 10 encode the elements of a different Type IV pili group (Table 4). Type IV pili are found on the surface of a variety of Gram-negative bacteria [44] and have been demonstrated to be important as host colonization factors, bacteriophage receptors, mediators of DNA transfer and, more recently, also as electron transfer factors over longer distances (so-called e-pili) [45–47].



The type IV major pilin assembly protein PilA found exclusively in the four *Desulfuromonas* strains is significantly larger (314 residues) than the common PilA homologue found in many other bacteria (only ~60–70 aa). The typical shorter PilA homologue was also found in all of the sequenced *Desulfuromonas* strains. This larger PilA protein (PGFam\_00056426) contains a transmembrane region (spanning approximately residues 70–160), and a BLASTP search revealed a geopilin domain membrane protein (247 aa; PGFam\_10038571) from *Pelobacter carbinolicus* DSM 2380 as the closest relative (49% protein identity and 67% similarity). Geopilin proteins have been implicated in direct interspecies electron transfer (e-pili) within syntrophic aggregates [48–50] and have also been shown to enhance current production in fuel cells [51].

The synteny of the geopilin-PilA protein is conserved in all four of our sequenced *Desulfuromonas* strains (Figure 7). The gene is preceded by a ferredoxin domain protein (PGFam\_00004340) and followed by DUF419 (a protein of unknown function; PGFam\_00038332), which are both also unique to these four *Desulfuromonas* strains. Ferredoxins are small proteins containing iron-sulfur clusters that function as biological "capacitors": they can accept or discharge electrons and are involved in electron transfer reactions in many organisms (for review see [52]). The presence of the ferredoxin protein directly upstream of the geopilin, which is proposed to be an electron-transfer pilus (e-pilus), is certainly intriguing and points towards a functional coupling of these proteins. In addition, unique PGFams for PilZ, PilX, PilV, PilP, and PilW, as well as FimT biogenesis proteins and the pilin glycosylation protein PglB, were found at other locations in our selected genomes. Using BLASTP (within PATRIC), we were able to identify one other *Desulfuromonas* strain, BM513, which contains a distant homologue of the larger PilA protein (66% sequence identity); however, the gene synteny is less conserved (Figure 7). This latter genome was assembled from a metagenomic sample of an environmental isolate, and nothing is currently known about potential symbiotic relationships of this strain. The presence of this larger geopilin-PilA homologue and several of the type IV pili biogenesis proteins indicates that the *Desulfuromonas* strains N3, N2, 2-K, and possibly the environmental strain BM513, are able to produce a unique set of type IV pili (e-pili) that could play a role in syntrophy and electron transfer.

**Figure 7.** Gene organization comparison around the geopilin-PilA pilus assembly protein in *Desulfuromonas* genomes. Genes are colored according to protein family (PGFam). The gene for the large major pilin protein PilA, discussed in the text, is marked in bold.

The mannose-sensitive hemagglutinin (MSHA; Table 4) is likewise a member of the family of type IV pili. While the exact function of MSHA is unknown, studies have shown that *msh*A mutants of *Vibrio cholerae* are unable to produce biofilms on abiotic surfaces, and these pili might have an environmental role in survival outside the host [53]. Two homologues of MshA were found to be present in DSM 1675, DSM 1676, DSM 684, and the N3 strain, in addition to homologues for MshD, I, J, K, N, and P, which are essential for MSHA pili biosynthesis (Table 4). Although these Msh PGFams were not found by PATRIC in the other *Desulfuromonas* strains, a BLASTP search within all *Desulfuromonas* strains in PATRIC identified more distant homologues (<40% aa identity) of MshA, and we found the same conserved gene cluster in the genomes of *Desulfuromonas* sp. AOP6, BM513, and *Ds. thiophila*. This indicates that the *msh* pili gene cluster might be more widespread amongst the *Desulfuromonas* species.

Besides these pili genes, the *Desulfuromonas* genomes also contain genes for several flagellar proteins, including the flagellar assembly protein FliH, flagellar basal body P-ring formation protein FlgA, the basal body rod protein FlgF, flagellar protein FlgJ (2 copies), and a flagellar regulatory protein FleQ. These were initially found as unique families by PATRIC in strains N3, DSM 1675, DSM 1676, and DSM 684. However, when performing a BLAST search, we found homologues of them (<45% protein identity) in several of the other strains, e.g., DSM 1397, AOP6, and *Ds. thiophila*. The presence of these flagellar genes is consistent with the earlier observations that the *Desulfuromonas acetoxidans* strains in the syntrophic mixtures are motile and able to produce functional flagella [8].

#### *3.3. Comparison of the Geobacter sulfurreducens Genomes to the Desulfuromonas Genomes*

It has been shown that *Prosthecochloris* could grow by direct interspecies electron transfer from *Geobacter sulfurreducens*, a close relative of *Desulfuromonas* [54]. To elucidate potential unique features in the species that have been shown to form the *Prosthecochloris* syntrophy, we compared the *Geobacter sulfurreducens* genomes of strains PCA and KN400 to the *Desulfuromonas* genomes. This revealed a unique large cytochrome C family protein (624 aa; PGFam\_12883110) that is only present in the genomes of *Ds.* DSM 1675 (2-K), DSM 1676 (N2), N3, 684, AOP6, and the two *Gb. sulfurreducens* strains. An EXPASY-BLASTP search revealed that this is a multiheme cytochrome, with the closest relative being an uncharacterized cytochrome C from *Geothermobacter* sp. HR-1 (67% identity and 80% similarity), and several homologues in other *Geobacter* and *Thermodesulfovibrio* strains, but missing from all the other *Desulfuromonas* strains. The protein contains an N-terminal signal peptide and, as shown in the alignment in Figure 8, at least 9 possible heme-binding sites (identified as CXXCH) were found to be present. The function of this large multiheme cytochrome is currently unknown and the surrounding genes consist of mainly hypothetical genes in all of the *Desulfuromonas* strains where it was found, so it does not appear to be genetically associated with any known specific pathway. The recently published genome of *Ds.* strain AOP6 showed that this strain has genes for a large c-type cytochrome and unique Type IV pili [55], although no further analysis was performed. We have now found this to be the case for our four genomes of *Desulfuromonas* as well. Since the *Geobacter* species can produce type IV pili and cytochromes that directly transport electrons through the pili to crystalline Fe(III) and Mn(IV) oxides [47,56], it is possible that this related multiheme cytochrome *c* and the Type IV pili play similar electron transport roles in the *Desulfuromonas* strains that contain these specific proteins.
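Counting candidate heme-binding sites of the CXXCH type amounts to a simple pattern search. The sketch below is a minimal illustration and not the tool used for Figure 8. Because the 624-aa cytochrome sequence is not reproduced in the text, it is applied to the short cytochrome c-551.5 sequence from Figure 1B (row 1), assuming that the two alignment blocks are consecutive. The function name `heme_binding_sites` is hypothetical.

```python
import re

# CXXCH: Cys, any two residues, Cys, His. The lookahead lets overlapping
# motifs, should any occur, each be reported.
HEME_MOTIF = re.compile(r"(?=(C..CH))")


def heme_binding_sites(protein):
    """Return (1-based start position, motif) for every CXXCH occurrence."""
    return [(m.start() + 1, m.group(1)) for m in HEME_MOTIF.finditer(protein)]


# Cytochrome c-551.5, soluble protein sequence as given in Figure 1B, row 1.
CYT_C5515 = (
    "ADVVTYENKKGNVTFDHKAHAEKLGCDACHEGTPAKIAIDKKSAHKDACKTCH"
    "KSNNGPTKCGGCHIK"
)

sites = heme_binding_sites(CYT_C5515)
for position, motif in sites:
    print(position, motif)
```

For this sequence the search reports three motifs (CDACH, CKTCH, and CGGCH). A motif match only marks a possible heme attachment site; it does not by itself show that a heme is bound.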

**Figure 8.** Alignment of the large unique multiheme cytochrome C sequences from *Desulfuromonas* genomes and their closest relatives from *Geothermobacter* and *Geobacter* species. Potential heme-binding sites are marked in yellow. Ds = *Desulfuromonas acetoxidans*.
