**1. Introduction**

The anaerobic anoxygenic photosynthetic betaproteobacteria are represented by a small group of bacteria currently classified in the *Burkholderiales* (the genera *Rhodoferax* and *Rubrivivax*) and *Rhodocyclales* (the genus *Rhodocyclus*) orders. Genome sequences of several of the *Burkholderiales* species have been previously reported, including strains of *Rubrivivax gelatinosus* [1,2], *Rubrivivax benzoatilyticus* [3], *Rhodoferax fermentans* [4], *Rhodoferax antarcticus* [5], and *Rhodoferax jenense* [6]. A genome sequence is also available for the aerobic anoxygenic photosynthetic bacterium *Roseateles depolymerans* [7], which is related to *Rubrivivax gelatinosus*.

Though the genome sequence of the type strain of *Rhodocyclus tenuis* DSM 109<sup>T</sup> has been reported recently [8], the diversity of this group of bacteria has not been studied in detail. The first and type species of the genus *Rhodocyclus*, *Rhodocyclus purpureus*, was discovered by Norbert Pfennig [9]. A second species of this genus was isolated and described based on a single strain as *Rhodospirillum tenue* [10] and later reclassified as *Rhodocyclus tenuis* [11]. Several strains that were assigned to *Rhodocyclus tenuis* have been isolated from various German freshwater lakes (Lake Pluss-See near Plön, a forest ditch near Grünenplan, Garrensee near Ratzeburg, Eutiner See, Nonnenmattweiher near Neuenweg Black Forest, Edebergsee near Plön) and peat bogs (Black Forest) by Hanno Biebl and Norbert Pfennig [12]. A single isolate was obtained by one of us (JFI) from a pond ("keyhole pond") in the Botanical Garden in Bonn. Some of these strains have been studied previously with regard to their physiological properties [12], carotenoids [13], lipopolysaccharide structures [14], sulfate assimilation [15], and lipid and fatty acid composition [16]. Although these studies revealed the heterogeneity of the group, no systematic studies have been performed.

**Citation:** Kyndt, J.A.; Aviles, F.A.; Imhoff, J.F.; Künzel, S.; Neulinger, S.C.; Meyer, T.E. Comparative Genome Analysis of the Photosynthetic Betaproteobacteria of the Genus *Rhodocyclus*: Heterogeneity within Strains Assigned to *Rhodocyclus tenuis* and Description of *Rhodocyclus gracilis* sp. nov. as a New Species. *Microorganisms* **2022**, *10*, 649. https://doi.org/10.3390/microorganisms10030649

Academic Editors: Robert Blankenship and Matthew Sattley

Received: 17 February 2022 Accepted: 15 March 2022 Published: 18 March 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Here, we report on the genome analysis of *Rhodocyclus purpureus* DSM 168<sup>T</sup> and *Rhodocyclus tenuis* DSM 109<sup>T</sup> and four strains previously assigned to *Rhodocyclus tenuis* and provide evidence for the existence of a new species of *Rhodocyclus,* represented by three of the strains studied that have previously been shown to be different from the type strain of *Rcy. tenuis* [12–14].

#### **2. Materials and Methods**

#### *2.1. Origin of the Strains of Rhodocyclus tenuis*

The type strain DSM 109<sup>T</sup> (Pfennig 2761) was isolated from a pond near Grünenplan (district Holzminden, Germany); strains DSM 110 (Pfennig 3760) and DSM 111 (Pfennig 3761) from Nonnenmattweiher pond near Neuenweg in the rural district Lörrach in the Black Forest (Germany); and strain DSM 112 (Pfennig 3661) from a kolk in the Hinterzartener Moor in the Black Forest, Germany. Strain IM 230 (Imhoff 230) was isolated by JF Imhoff from a small pond called "Schlüssellochteich" (keyhole pond) in the Botanical Garden of Bonn University.

#### *2.2. Genome Sequencing*

Genomic DNA for *Rhodocyclus tenuis* DSM 110, DSM 111, and DSM 112 was obtained from DSMZ. Cells of *Rcy. purpureus* DSM 168<sup>T</sup> and *Rcy. tenuis* IM 230 were grown under the recommended conditions [17], and DNA was extracted from well-grown cultures according to Imhoff et al. [18]. The quantity and purity of DNA were determined using Qubit and Nanodrop instruments and showed 260/280 ratios between 1.80 and 1.90. The DNA libraries were prepared with the Nextera® XT DNA Sample Preparation kit from Illumina (San Diego, CA, USA) following the manufacturer's protocol.

The genomes of *Rhodocyclus* sp. strains DSM 110, DSM 111, and DSM 112 were sequenced using 500 µL of a 1.8 pM library with an Illumina MiniSeq instrument, using paired-end sequencing (2 × 150 bp). Quality control of the reads was performed using FASTQC within BaseSpace (Illumina, San Diego; version 1.0.0), using a k-mer size of 5 and contamination filtering. The reads for each strain were assembled de novo using SPAdes (version 3.10.0; [19,20]) for DSM 110 and DSM 111 or Unicycler within PATRIC [21,22] for DSM 112. Default k-mer lengths were used for both programs. The genome sequences were annotated using RAST (Rapid Annotations using Subsystem Technology; version 2.0; [23]). An EvalG genome quality analysis, using the CheckM algorithm [24], was run during PATRIC annotation and showed an estimated 100% completeness and 0% contamination for each of these genomes.

The *Rcy. purpureus* and *Rcy. tenuis* IM 230 genomes were sequenced on an Illumina MiSeq using the MiSeq® Reagent Kit v3 600 cycles sequencing chemistry (Illumina, San Diego, CA, USA), with a cluster density of approximately 1200 K/mm<sup>2</sup>. Trimmomatic v0.36 [25] was used for read quality filtering. Illumina Nextera XT adapters were removed from the reads. Quality trimming was conducted with a 5-base pair (bp) sliding window, trimming the reads where the average Phred quality score fell below 30. Reads longer than 21 bp after quality trimming were retained. Only single reads (i.e., reads with their mate deleted) were included in downstream analysis. Reads were further checked for ambiguous base calls, as well as for low complexity, using the DUST algorithm [26]. They were filtered accordingly with an in-house R script in Microsoft R Open v3.3.2 (R Core Team 2016). SPAdes v3.10.0 was used for the pre-assembly of the filtered reads [19,20], using default k-mer lengths. Scaffolds ≥ 500 bp of this pre-assembly were subjected to extension and second-round scaffolding with SSPACE standard v3.0 [27]. Scaffolds ≥ 2500 bp were assigned to genome bins by MetaBAT v0.32.4 [28], to ensure draft-genome purity of *Rcy. purpureus* DSM 168<sup>T</sup>. From the two resulting genome bins (3.595 and 1.048 Mbp, respectively), the larger one with a G + C content of 66 mol% was selected as the draft genome of *Rcy. purpureus*. Base coverage was determined with BBMap v36.81 (https://sourceforge.net/projects/bbmap (accessed on 29 May 2017)) [29] for filtered reads unambiguously mapped to the scaffolds of the draft genome. Estimated fold-coverage was calculated as the median base coverage over all scaffold positions.
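The sliding-window trimming and the median-based coverage estimate described above can be illustrated with a minimal Python sketch. This is not Trimmomatic's or BBMap's actual implementation; the function names, the handling of reads shorter than one window, and the decision to cut at the start of the first failing window are assumptions made for illustration. Only the window size (5 bp), the Phred threshold (30), and the length filter (>21 bp) come from the text.

```python
from statistics import mean, median


def sliding_window_trim(quals, window=5, min_avg=30):
    """Return how many leading bases to keep (hypothetical helper).

    Scans from the 5' end and cuts at the start of the first window
    whose mean Phred score drops below min_avg.
    """
    n = len(quals)
    if n < window:
        # Assumption: a read shorter than one window is judged as a whole.
        return n if n and mean(quals) >= min_avg else 0
    for start in range(n - window + 1):
        if mean(quals[start:start + window]) < min_avg:
            return start
    return n


def keep_read(seq, quals, window=5, min_avg=30, min_len=22):
    """Trim a read; return the trimmed sequence, or None if it is too short.

    min_len=22 corresponds to 'read lengths >21 bp' in the text.
    """
    keep = sliding_window_trim(quals, window, min_avg)
    trimmed = seq[:keep]
    return trimmed if len(trimmed) >= min_len else None


def estimated_fold_coverage(per_base_depth):
    """Median base coverage over all scaffold positions."""
    return median(per_base_depth)
```

A read with 30 high-quality bases followed by 10 low-quality bases is cut where the window average first falls below 30, and the remainder is kept only if it passes the length filter.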

#### *2.3. Whole Genome Comparison*

Average percentage nucleotide identity (ANIb) between the whole genomes was calculated using JSpecies [30]. A whole genome-based phylogenetic tree was generated using the CodonTree method within PATRIC [22], which used PGFams (global (cross-genus) protein families) as homology groups. A total of 445 PGFams were found among these selected genomes using the CodonTree analysis, and the aligned proteins and coding DNA from single-copy genes were used for RAxML analysis [31,32]. iTOL was used for tree visualization [33]. A proteome comparison was performed by protein-sequence-based genome comparison using bidirectional BLASTP within PATRIC [22]. Average amino acid identity (AAI) values were calculated from the proteome comparison within PATRIC [22], using only bi-directional hits with *Rcy. tenuis* DSM 109<sup>T</sup> as the reference strain. Digital DNA–DNA Hybridization (dDDH) data were obtained using the Type (Strain) Genome Server (TYGS) web server (https://tygs.dsmz.de (accessed on 3 April 2021)) [34]. The program used the distance formula d4 to calculate a similarity based on sequence identity.
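As an illustration of the AAI calculation from bi-directional hits, the following Python sketch averages percent identity over reciprocal best hits only. It is a simplified stand-in, not PATRIC's code: the input dictionaries, the function names, and the choice to use the forward-direction identity for each pair are assumptions.

```python
def reciprocal_best_hits(forward, reverse):
    """Keep only query/reference pairs that are each other's best hit.

    forward: query protein id -> (best reference protein id, % identity)
    reverse: reference protein id -> (best query protein id, % identity)
    Both mappings are hypothetical inputs, e.g. parsed from BLASTP tables.
    """
    pairs = []
    for query, (ref, identity) in forward.items():
        back = reverse.get(ref)
        if back is not None and back[0] == query:
            pairs.append((query, ref, identity))
    return pairs


def average_amino_acid_identity(forward, reverse):
    """Return (AAI in percent, number of bi-directional hits used)."""
    pairs = reciprocal_best_hits(forward, reverse)
    if not pairs:
        raise ValueError("no bi-directional hits found")
    total = sum(identity for _, _, identity in pairs)
    return total / len(pairs), len(pairs)
```

Unidirectional hits (a query whose best reference hit points back to a different query) are dropped before averaging, which is why the protein counts used for AAI are smaller than the full proteome sizes.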

For synteny analysis, global PATRIC PGFam families were used to generate comparative genome regions to determine a set of genes that match a focus gene [22]. All *Rhodocyclus* genomes were used in the search and were compared to the DSM 110 genome. The gene set was compared to the focus gene using BLAST and sorted by BLAST scores within PATRIC [22]. The *cbi*X (long cobaltochelatase) gene was used as a focus gene to analyze the synteny of the cobalamin synthesis gene cluster.

The multiple sequence alignments for the 16S rRNA, HiPIP, and RuBisCo comparisons were performed using Clustal Omega [35]. All of the 16S rRNA sequences were genome derived. The phylogenetic tree was calculated by the neighbor-joining (NJ) method [36] within Jalview [37], and a Newick file was generated. iTOL was used to draw the phylogenetic trees expressed in the Newick phylogenetic tree format [33].
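Neighbor-joining operates on pairwise distances derived from the alignment. The sketch below shows one simple way to obtain a percent identity and a corresponding distance from two aligned sequences. It assumes that columns containing a gap in either sequence are excluded from the comparison; Jalview may count columns differently, so this is an illustration rather than a reproduction of its calculation.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, number of compared columns).

    Both inputs must come from the same alignment and have equal length.
    Columns with a gap in either sequence are skipped (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no overlapping columns to compare")
    return 100.0 * matches / compared, compared


def identity_distance(aligned_a, aligned_b):
    """Simple distance in [0, 1], usable as input to a distance-based tree."""
    identity, _ = percent_identity(aligned_a, aligned_b)
    return 1.0 - identity / 100.0
```

The second return value of `percent_identity` corresponds to the overlap length that accompanies identity values such as those reported for 16S rRNA comparisons.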

#### **3. Results and Discussion**

#### *3.1. Whole Genome Analysis*

The genomic features of six strains of the genus *Rhodocyclus,* including five new genome sequences and the previously sequenced *Rcy. tenuis* DSM 109<sup>T</sup> genome, were compared (summary in Table 1). Based on the genome size and the G + C content, three groups can be recognized: *Rcy. tenuis* DSM 109<sup>T</sup> and IM 230 represent group 1 and have an identical G + C content (64.7 mol%) and slightly larger genome size compared to those of group 2, with strains DSM 110, DSM 111, and DSM 112. *Rcy. purpureus* DSM 168<sup>T</sup> is the only representative of group 3 and has a similar genome size compared to the group 1 strains but a higher G + C content (66.1 mol%).


**Table 1.** Overview of genome features of all of the *Rhodocyclus* genome sequences.

The Average Nucleotide Identity (ANI) comparison revealed a 97.1% ANI of strains within group 1 and values of 98.8% and higher of strains within group 2 (Table 2). However, the comparison between strains from the two groups shows ANI values below 80%. Applying the arbitrary cutoff value for species differentiation of 95% [30], the two groups should clearly represent different species. *Rcy. purpureus* has ANI values of 80% or less with all of the other strains and is rightfully recognized as a distinct species. For comparison, *Accumulibacter phosphatis* was included (Table 2) as a species from a closely related genus [38], which showed ANI values of 72–73% with all of the *Rhodocyclus* strains.

**Table 2.** Whole-genome-based average nucleotide identity (ANI) of *Rhodocyclus* species and relatives. ANI values above the species cutoff of 95% are shown in bold.


The ANI data imply that all of the studied strains are indeed members of the genus *Rhodocyclus*, and that *Rhodocyclus purpureus* and *Rhodocyclus tenuis* are distinct species of this genus. However, the strains of group 2 (DSM 110, DSM 111, and DSM 112) belong to a new species of the genus *Rhodocyclus*. A whole-genome-based phylogenetic tree (Figure 1) supports these findings and shows the group of strains DSM 110, DSM 111, and DSM 112 as very close relatives of each other but apart from the type strains of *Rcy. tenuis* DSM 109<sup>T</sup> and *Rcy. purpureus* DSM 168<sup>T</sup>. This indicates that this group indeed forms a separate species of the *Rhodocyclus* genus.
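The species-level grouping implied by the 95% ANI cutoff can be sketched as a single-linkage clustering over pairwise ANI values. The Python example below is illustrative only. Apart from the 97.1% value for DSM 109<sup>T</sup> versus IM 230, the numbers in `EXAMPLE_ANI` are placeholders chosen to lie within the ranges stated in the text (98.8% or higher within group 2, below 80% between groups); they are not the measured entries of Table 2.

```python
def group_by_ani(strains, ani, cutoff=95.0):
    """Group strains whose pairwise ANI meets the species cutoff.

    strains: list of strain names
    ani: dict mapping (strain_a, strain_b) -> ANI in percent
    Uses a small union-find structure for single-linkage grouping.
    """
    parent = {s: s for s in strains}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path compression
            s = parent[s]
        return s

    for (a, b), value in ani.items():
        if value >= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for s in strains:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(g) for g in groups.values())


STRAINS = ["DSM 109", "IM 230", "DSM 110", "DSM 111", "DSM 112", "DSM 168"]

# Placeholder values within the ranges given in the text, not Table 2 data.
EXAMPLE_ANI = {
    ("DSM 109", "IM 230"): 97.1,   # stated in the text
    ("DSM 110", "DSM 111"): 98.8,  # placeholder: "98.8% and higher"
    ("DSM 111", "DSM 112"): 98.8,  # placeholder: "98.8% and higher"
    ("DSM 109", "DSM 110"): 79.0,  # placeholder: "below 80%"
    ("DSM 168", "DSM 109"): 79.0,  # placeholder: "80% or less"
}

SPECIES_GROUPS = group_by_ani(STRAINS, EXAMPLE_ANI)
```

With these inputs, `SPECIES_GROUPS` contains three groups that match groups 1, 2, and 3 as described in Section 3.1.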


**Figure 1.** Whole-genome-based phylogenetic tree of all known *Rhodocyclus*, compared to representative genomes of close relatives. One hundred rounds of the 'Rapid bootstrapping' option of RAxML were used to generate the support values for the phylogenetic tree. The branch length tree scale is defined as the mean number of substitutions per site, which is an average across both nucleotide and amino acid changes. The *Rhodocyclus* genomes are colored differently based on their ANI values (with a species cutoff of 95%). *Allochromatium vinosum* DSM 180<sup>T</sup> was added as an outgroup.


A protein-sequence-based genome comparison, with the type strain of *Rcy. tenuis* DSM 109<sup>T</sup> as a reference, provided a complete proteome comparison (Figure 2). This provided amino acid sequence identity for both bi- and unidirectional hits between each genome and the reference genome (color coded in Figure 2). The average amino acid sequence identity (AAI), using only reciprocal (bi-directional) hits, was calculated for each genome, which showed a 78.4% identity with DSM 110 (from 2203 proteins), 77.8% with DSM 111 (from 2184 proteins), and 78.0% with DSM 112 (from 2185 proteins). Similar values were obtained for *Rcy. purpureus* DSM 168<sup>T</sup> with 77.0% identity (from 1938 proteins). However, a much higher value of 97.6% was found when comparing the *Rcy. tenuis* IM 230 proteome (from 3058 proteins) with the DSM 109 reference proteome. These data correspond well with the other analyses and further support the distinction of the strains DSM 110, DSM 111, and DSM 112 at the species level.

Digital DNA–DNA hybridization analyses (dDDH) showed only a distant relationship between the type strain of *Rcy. tenuis* DSM 109<sup>T</sup> and the strains DSM 110 (25.0%), DSM 111 (24.9%), and DSM 112 (25.1%). These values are similar to the value obtained between *Rcy. tenuis* DSM 109<sup>T</sup> and *Rcy. purpureus* DSM 168<sup>T</sup> (24.8%). The dDDH values between *Rcy. purpureus* and the three strains also indicate a distant relationship, with 22.9% (DSM 110), 23.0% (DSM 111), and 23.0% (DSM 112). On the other hand, the dDDH values amongst the three strains DSM 110, DSM 111, and DSM 112 were very high (91–92%). Consistent with the analyses provided above, this places these three strains in a closer relationship with each other than with any of the other species, equidistant from *Rcy. tenuis* DSM 109<sup>T</sup> and *Rcy. purpureus* DSM 168<sup>T</sup>, supporting their placement into a separate species group.

**Figure 2.** Proteome comparison of *Rhodocyclus* species and relatives based on protein-sequence-based genome comparison using bidirectional BLASTP. *Rcy. tenuis* DSM 109<sup>T</sup> was used as the reference proteome. The percent protein identity is color-coded for each proteome as compared to the reference proteome. *Rcy. tenuis* IM 230 showed average amino acid identities (AAI) of 97.6% (blue-green), while the *Rcy.* DSM 110, *Rcy.* DSM 111, *Rcy.* DSM 112, and *Rcy. purpureus* proteomes are equidistant from the reference proteome (77–78% identity; yellow-orange).

In addition to the whole-genome-based analyses, we also compared the 16S rRNA sequences of all *Rhodocyclus* species and related species. A 16S rRNA-based phylogenetic tree is provided in Figure 3. *Rcy. tenuis* DSM 109<sup>T</sup> and IM 230 have a 99.4% 16S rRNA identity (1541 nt. overlap), while the three strains DSM 110, DSM 111, and DSM 112 only have 97.4% identity with *Rcy. tenuis* DSM 109<sup>T</sup> (1541 nt. overlap). They do have 99.9% identity amongst themselves. The *Rcy. purpureus* 16S rRNA is equidistant from *Rcy. tenuis* DSM 109<sup>T</sup> and DSM 110, with 96.6% and 96.0% identity, respectively (1551 nt. overlap). These values are below the proposed species delineation for 16S rRNA comparisons of 98.7% [39] and place *Rcy. purpureus* and the three strains, DSM 110, DSM 111, and DSM 112, on separate clades in the 16S rRNA phylogenetic tree (Figure 3), which is consistent with the whole-genome-based analyses described above.


**Figure 3.** 16S rRNA phylogenetic tree for *Rhodocyclus* and related species. The phylogenetic tree was calculated by the neighbor-joining (NJ) method [36] within Jalview [37]. iTOL was used to draw the phylogenetic trees expressed in the Newick phylogenetic tree format [33]. *Allochromatium vinosum* DSM 180<sup>T</sup> was added as an outgroup. *Rhodocyclus* species were color-coded the same as in Figure 1.


#### *3.2. Cytochrome and High Potential Iron Protein HiPIP Analysis*

An interesting aspect of this study is the apparent use of different electron donors to the photosynthetic reaction center. While HiPIP is the normal electron donor to most photosynthetic reaction centers in the *Gammaproteobacteria*, cytochrome *c*<sub>2</sub> fills this role in the *Alphaproteobacteria*. In phototrophic *Betaproteobacteria*, the situation is different. While HiPIP is the usual electron donor in *Rvi. gelatinosus*, a high potential cytochrome *c*<sub>8</sub>, which is induced under aerobic growth, may also participate under some conditions [40]. HiPIP is the electron donor to reaction centers in *Rfx. fermentans* as well [41]. *Rcy. tenuis* is known to utilize both HiPIP and cytochrome *c*<sub>8</sub> in cyclic electron transfer depending on the growth conditions [42].

Soluble electron transfer proteins were previously characterized from strains DSM 109<sup>T</sup> and DSM 111 and found to be similar to one another but distinct from those of *Rcy. purpureus* [43]. These were described as cytochrome *c*<sub>4</sub> (minor component), cytochrome *c*<sub>8</sub>, cytochrome *c*-552 (NirB), cytochrome *c*', and HiPIP. The latter appears to be absent in *Rcy. purpureus*. A multiple sequence alignment and phylogenetic tree of the HiPIP protein from all *Rhodocyclus tenuis* species (Figure 4) resulted in a phylogenetic relationship that is consistent with the whole genome and ANI comparisons described above. The HiPIP protein sequences from DSM 110, DSM 111, and DSM 112 clearly form a clade on the tree separate from the two sequences of *Rcy. tenuis* DSM 109<sup>T</sup> and IM 230. *Rcy. tenuis* DSM 109<sup>T</sup> and *Rcy. gracilis* apparently utilize HiPIP as electron donor to the photosynthetic reaction center, but we have now shown that the HiPIP gene, as well as the soluble protein, is lacking in *Rcy. purpureus*, confirming the previous analysis and also suggesting that a cytochrome, presumably *c*<sub>8</sub>, fills the role of mediator between the cytochrome *bc*<sub>1</sub> complex and the PufLMC reaction center.

**Figure 4.** Phylogenetic tree of the HiPIP protein sequences obtained from the *Rhodocyclus* genomes. *Rubrivivax gelatinosus* DSM 149 HiPIP was used as an outgroup.

#### *3.3. Nitrogen Metabolism*

All species of *Rhodocyclus* apparently produce large amounts of the denitrifying diheme cytochrome NirB [43], which has been shown by the sequence of the protein from *Rcy. tenuis* DSM 109<sup>T</sup> and DSM 111 [44] and by the fact that all of the *Rhodocyclus* genome sequences in the current study contain the *nir*B gene. However, none of these species apparently have the corresponding denitrifying genes for nitrite, nitric, or nitrous oxide reductases, and NirB apparently assumes a different role. NirB was originally discovered in *Pseudomonas stutzeri* as part of the denitrification pathway, in which it forms a polycistronic mRNA along with the nitrite reductase, NirS (cytochrome *cd*<sub>1</sub>), and the membrane-bound tetraheme cytochrome, NirT [45]. The implication is that NirT donates electrons to NirB, which in turn reacts with NirS to reduce nitrite to nitric oxide. The normal electron donor to NirS in other pseudomonads is NirM, or C8 as it is also called, and it may be involved in *Ps. stutzeri* as well, with NirB enhancing the interaction. Perhaps the role of NirB in *Rhodocyclus* species is to facilitate the interaction of C8 with the cytochrome *bc*<sub>1</sub> complex and PufLMC. This deserves further study.

Consistent with the earlier observations that *Rcy. purpureus* is incapable of fixing molecular nitrogen, while *Rcy. tenuis* DSM 109<sup>T</sup> showed nitrogenase activity [46], we identified a total of 16 nitrogenase-related PGFams that are absent in *Rcy. purpureus* but present in all of the other *Rhodocyclus* genomes. These include the (Fe-Fe) nitrogenase (alpha, beta, and delta chains), (Mo-Fe) nitrogenase (alpha and beta chains), nitrogen reductase and maturation proteins, two 4Fe-4S nitrogenase-associated ferredoxins, a nitrogenase transcriptional regulator, and several Fe-Mo cofactor assembly proteins.

#### *3.4. Cobalamin Metabolism*

When comparing the different genomes, it was found that the three *Rhodocyclus* strains DSM 110, DSM 111, and DSM 112 all contain at least 12 unique genes related to anaerobic cobalamin (vitamin B<sub>12</sub>) synthesis, which are all missing from the other three genomes. These genes code for cobalt–corrin metabolic enzymes and cobalt transporter subunits and are organized in a large gene cluster (Figure 5). The gene synteny of the cluster is conserved in all three strains. It has been known for decades that two pathways exist in nature for the de novo biosynthesis of vitamin B<sub>12</sub>. The pathways differ in their first part, which involves corrin synthesis: one pathway is anaerobic (as found in *Salmonella typhimurium* and *Bacillus megaterium*) and the other is oxygen-dependent [47–50]. Figure 5 includes the KEGG pathways and shows that all the gene products necessary for anaerobic corrin synthesis (the cobalt-containing modified tetrapyrrole component of vitamin B<sub>12</sub>) are present in the *Rhodocyclus* gene cluster. The ATP-dependent transport system encoded by the corrin biosynthetic operon in *S. typhimurium* (CbiMNQO) mediates transport of cobalt ions for B<sub>12</sub> synthesis [47]. In addition, vitamin B<sub>12</sub> and other corrinoids are actively transported using the TonB-dependent outer membrane receptor BtuB in complex with the ABC transport system BtuFCD [51]. The *Rhodocyclus* gene cluster also contains a unique outer membrane vitamin B<sub>12</sub> receptor BtuB and transporter BtuN (Figure 5).

**Figure 5.** (**A**). Overview of the anaerobic cobalamin genetic pathway that was found in *Rhodocyclus* strains DSM 110, DSM 111, and DSM 112. Synteny plots were generated in PATRIC [22] and genes are colored based on enzymatic family. (**B**). Overview of the anaerobic and aerobic cobalamin metabolic pathways. Enzyme numbering is the same as corresponding gene numbers in (**A**).

In addition to the PATRIC PGFam comparison, we also checked for the presence of each of these cobalamin metabolic genes individually with BLAST and only found a truncated version of one of the enzymes (gene 6 in Figure 5) in the *Rcy. purpureus* and *Rcy. tenuis* DSM 109<sup>T</sup> and IM 230 genomes. This could be an indication that the anaerobic cobalamin biosynthetic pathway was lost during evolution, although further analysis that includes more evolutionarily divergent species would be needed to confirm this.
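The presence/absence logic behind a protein-family comparison of this kind can be illustrated with plain set operations. The following is only a minimal sketch: the genome labels and family identifiers are hypothetical placeholders, not actual PATRIC output, and the function name `families_unique_to` is introduced here purely for illustration.

```python
from typing import Dict, Set


def families_unique_to(group: Set[str], families: Dict[str, Set[str]]) -> Set[str]:
    """Return families present in every genome of `group` and absent from all other genomes."""
    in_group = [families[g] for g in group]
    outside = [fams for g, fams in families.items() if g not in group]
    shared = set.intersection(*in_group)   # present in all genomes of the group
    elsewhere = set().union(*outside)      # present in at least one other genome
    return shared - elsewhere


# Hypothetical toy input: genome label -> set of protein family identifiers.
toy = {
    "strain_A": {"fam_core", "fam_cbi_1", "fam_cbi_2"},
    "strain_B": {"fam_core", "fam_cbi_1", "fam_cbi_2", "fam_x"},
    "strain_C": {"fam_core", "fam_x"},
}

unique = families_unique_to({"strain_A", "strain_B"}, toy)
print(sorted(unique))
```

A family flagged this way is only a candidate; as in the text above, an independent sequence search is still needed to rule out truncated or misannotated copies in the other genomes.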

The lack of this anaerobic cobalamin metabolism pathway explains why Pfennig originally described *Rcy. purpureus* as a vitamin B12-requiring member of the *Rhodospirillaceae* [9]. According to this genome comparison, strains DSM 109<sup>T</sup> and IM 230 would also need cobalamin as a growth factor as they grow anaerobically; however, the need for this has not been described. Either way, the presence of this pathway exclusively in the three *Rcy.* strains DSM 110, DSM 111, and DSM 112 further distinguishes them genetically from the other *Rhodocyclus* species.

#### *3.5. Chemotaxis and Motility*

*Rcy. purpureus* was found to be non-motile [9], while *Rcy. tenuis* was highly motile [10]. We found 31 flagella-related PGFams that were absent from the *Rcy. purpureus* genome but present in all the other *Rhodocyclus* strains. These included all of the biosynthetic, structural and regulatory proteins for flagella assembly. In addition, there are at least 12 unique chemotaxis-related PGFams present, including two CheWs, CheA, CheB, CheR, CheZ, CheY, CheD, CheV and MCPs. This chemotaxis gene cluster is directly upstream of the flagella genes in all of these genomes. The presence of these genes confirms that all of the *Rhodocyclus* species, except *Rcy. purpureus*, are motile and are expected to perform chemotaxis.

#### *3.6. RuBisCo*

Genomes of *Rcy. tenuis* and the new species *Rhodocyclus gracilis* have two forms of RuBisCo. One gene encodes a single 463 aa. protein (PGFam\_00048972) and is included in a gene cluster with *cbb*Y-*rbc*-*cbb*R-fructose-1,6-bisphosphatase-phosphoribulokinase. This cluster is also present in *Rcy. purpureus*. The second system of RuBisCo consists of a large (474 aa.; PGFam\_00048973) and small subunit (118 aa.; PGFam\_00048975), included in a gene cluster of *cbb*R-*rbc*L-*rbc*S-*cbb*Q-*cbb*O, which is absent from *Rcy. purpureus* DSM 168<sup>T</sup>. The presence of multiple RuBisCo forms is not uncommon, and the sequences and structural gene organization indicate that the *Rhodocyclus* single RuBisCo gene belongs to Form II, while the rbcL and rbcS resemble Form I RuBisCo found in other proteobacteria [52].

The translated protein sequence of the single RuBisCo gene (PGFam\_00048972) that is present in all *Rhodocyclus* genomes was used for a phylogenetic comparison (Figure 6). The RuBisCo protein sequence of *Rcy. gracilis* strains is more closely related to *Rcy. purpureus* DSM 168<sup>T</sup>, with 94.1% identity (98.3% similarity), than to *Rcy. tenuis* DSM 109<sup>T</sup>, with 91.6% identity (97.8% similarity). As expected, the *Rcy. tenuis* IM 230 RuBisCo shows a high protein identity to the one from *Rcy. tenuis* DSM 109<sup>T</sup> (99.8% identity, 100% similarity), while the *Rcy. tenuis* DSM 109<sup>T</sup> and *Rcy. purpureus* sequences were less similar (90.8% identity, 96.3% similarity). The phylogenetic tree based on the RuBisCo multiple protein sequence alignment (Figure 6) shows a similar topology to the whole genome, 16S rRNA and cytochrome-based comparisons, and further supports the proposed species differentiation.
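To make the distinction between percent identity and percent similarity concrete, the sketch below computes both from two already-aligned protein sequences. It is an illustration only and not the procedure used for the values above: the residue groups in `SIMILAR_GROUPS` are an assumed, simplified notion of "similar" (alignment tools typically derive this from a substitution matrix), and gapped columns are excluded from the denominator, which is one of several conventions in use.

```python
from typing import Tuple

# Assumption: a simplified grouping of chemically similar residues.
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "ST", "NQ", "AG"]
GROUP_OF = {res: i for i, grp in enumerate(SIMILAR_GROUPS) for res in grp}


def identity_similarity(a: str, b: str) -> Tuple[float, float]:
    """Return (% identity, % similarity) over the ungapped columns of two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif x in GROUP_OF and GROUP_OF.get(x) == GROUP_OF.get(y):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Hypothetical toy alignment, not taken from the RuBisCo sequences.
ident, simil = identity_similarity("MKVLDE-A", "MKILEEGA")
print(f"identity {ident:.1f}%, similarity {simil:.1f}%")
```

Because similarity counts identical residues plus conservative substitutions, it is always at least as high as identity, which matches the pattern of the paired values reported above.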

**Figure 6.** Phylogenetic tree based on multiple sequence alignment of the RuBisCo protein sequences, obtained from the *Rhodocyclus* genomes (PGFam\_00048972). *Rhodospirillum rubrum* ATCC 11170<sup>T</sup> RuBisCo was added as an outgroup.

#### *3.7. Taxonomic Considerations*

A limited number of *Rhodocyclus* isolates have been studied and compared in several studies during the past decades. Strains were isolated from various freshwater lakes in Germany by Hanno Biebl and Norbert Pfennig and their properties compared [12]. Although two color variants were recognized, a brownish-red colored one, including the type strain of *Rcy. tenuis* DSM 109<sup>T</sup>, and a red-colored or purple-violet one, including strains now assigned to *Rcy. gracilis*, no clear differentiation of groups of strains was made [12].

The different colored cultures showed differences in absorption spectra in the carotenoid region (450–550 nm) [12] and also revealed a different carotenoid composition [13]. The type strain of *Rcy. tenuis* DSM 109<sup>T</sup> has carotenoids of the spirilloxanthin series with major portions of lycopene, rhodopin and anhydro-rhodovibrin and spirilloxanthin as the final product of this pathway. The purple-violet-colored DSM 110, DSM 111, and DSM 112 contain significant amounts of rhodopinal, rhodopinol, and lycopenal in addition to rhodopin and lycopene but lack spirilloxanthin, rhodovibrin, and anhydro-rhodovibrin [13], a property they share with *Rcy. purpureus* DSM 168<sup>T</sup> [13].

Studies on the lipopolysaccharides of eight isolates assigned to *Rhodocyclus tenuis* revealed a common pattern of sugars characterized by the presence of glycerol-mannoheptose, glucose, arabinose, 2-keto-3-deoxyoctonoate and glucosamine in all studied strains [14]. D-galactosamine was found only in those strains now assigned to *Rcy. gracilis* (DSM 110, DSM 111, DSM 112), whereas the type strain of *Rcy. tenuis* DSM 109<sup>T</sup> (2761) lacked D-galactosamine and in turn had quinovosamine as a strain-specific sugar [14].

Based on the unique genomic and genetic features of the *Rhodocyclus* strains described above, it is clear that strains DSM 110, DSM 111, and DSM 112 belong to a single species which is separate from *Rcy. tenuis* and *Rcy. purpureus*. Of the strains compared in the present study, only strains DSM 109<sup>T</sup> and IM 230 remain as strains of *Rcy. tenuis*. For the strains recognized as a new species, the name *Rhodocyclus gracilis* sp. nov. is proposed.

The characteristic properties that distinguish strains of *Rcy. gracilis* from *Rcy. tenuis* are the utilization of ethanol [12] and the presence of carotenoids of the rhodopinal series [13], which coincides with the purple-violet color of *Rcy. gracilis*. The pH optimum of *Rcy. gracilis* is slightly lower, at pH 6.1–6.4, compared to the pH 6.7 of *Rcy. tenuis* [12]. The genomes are different in size, 2.93–2.98 Mb for *Rcy. gracilis* and 3.65–3.85 Mb for *Rcy. tenuis* (Table 1), and the G + C content is 64.5 mol% in *Rcy. gracilis* and 64.7 mol% in *Rcy. tenuis* (Table 1).
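For readers less familiar with the G + C value quoted from genome analysis, it is simply the percentage of G and C among the bases of the assembled sequence. The sketch below shows the arithmetic on a toy string; the function name is hypothetical, and ignoring ambiguous bases such as N is an assumption made here, not a statement about how the values in Table 1 were derived.

```python
def gc_content_percent(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a DNA sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # ambiguous symbols such as N are ignored (assumption)
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Toy sequence for illustration only.
print(round(gc_content_percent("GGCCNNAT"), 1))
```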

Due to the lack of a clear differentiation of groups of strains of *Rcy. tenuis* in previous studies, the properties of both *Rcy. gracilis* and *Rcy. tenuis* have been listed as the properties of *Rcy. tenuis* in the literature [12–14,53]. Consequently, the species description of *Rcy. tenuis* should be emended accordingly.

Characteristic for all *Rhodocyclus* species is the presence of phosphatidyl glycerol (PG), phosphatidyl ethanolamine (PE), diphosphatidyl glycerol (CL), and an ornithine lipid as major polar lipids; the dominance of C-16 fatty acids (33–36% C-16:0 and 43–50% C-16:1) and minor amounts of the C-18 fatty acids (<0.5% C-18:0 and 14–18% C-18:1); as well as ubiquinone Q-8 and menaquinone MK-8 as major quinone components [16].

#### 3.7.1. Description of *Rhodocyclus gracilis* sp. nov

*Rhodocyclus gracilis*. gra'ci.lis M.L. neut. adj. *gracilis* Slender

Cells are weakly curved, 0.3–0.5 µm wide and 1.5–5 µm long, motile by polar flagella and divide by binary fission. Cultures grown anaerobically in the dark are red to purple-violet in color and have absorption maxima at 377–378, 469, 495–500, 529–533, 590–592, 798–801, and 856–858 nm. Photosynthetic pigments are bacteriochlorophyll-a and carotenoids of the rhodopinal series. Internal photosynthetic membranes are present as small finger-like intrusions of the cytoplasmic membrane.

Growth occurs preferably under phototrophic conditions anaerobically in the light. Under these conditions, organic carbon compounds are used as carbon and energy sources. The sources utilized are acetate, propionate, butyrate, valerate, caproate, lactate, pyruvate, fumarate, malate, succinate, and ethanol. Some strains may use pelargonate and yeast extract. Not utilized are tartrate, citrate, benzoate, methanol, glycerol, glucose, fructose, mannitol, alanine, glutamate, aspartate, arginine, thiosulfate, and sulfide. Sulfide is growth inhibitory at 2 mM. Chemotrophic growth under aerobic dark conditions is possible. Aerobically grown cells are colorless, and the aerobic Mg-protoporphyrin IX monomethyl ester oxidative cyclase is absent. Photolithotrophic growth with hydrogen as an electron source may be possible. Ammonium chloride and dinitrogen are used as nitrogen sources. Growth factors may be required. Mesophilic freshwater bacterium with optimum growth at 30 °C and pH 6.1–6.4 (pH range 4.9–8.2). Habitats are freshwater lakes and peat bogs. The habitat of the type strain DSM 110 is a dystrophic pond in the Black Forest (Germany).

The type strain has a G + C content of the DNA of 64.5 mol% (genome analysis) and a genome size of 2.93 Mb. The type strain is deposited with the Deutsche Sammlung von Mikroorganismen und Zellkulturen as DSM 110<sup>T</sup> (Pfennig 3760) and the Japan Collection of Microorganisms Riken BRC.

The GenBank accession number of the 16S rDNA sequence of the type strain is OM179767 and that of the genome is WIXJ00000000.

#### 3.7.2. Emended Description of *Rhodocyclus tenuis* Imhoff, Trüper and Pfennig 1984, 341<sup>VP</sup> (*Rhodospirillum tenue* Pfennig 1969, 619<sup>AL</sup>)

te'nu.is. L. masc. adj. *tenuis* Slender, Thin

Cells are weakly curved spirals, highly motile by polar flagella. They are 0.3–0.5 µm wide and 1.5–6.0 µm long, sometimes even longer. One complete turn of a spiral is about 0.8–1.0 µm wide and 3 µm long. Photosynthetically grown cells are brownish-red and have absorption maxima at 378–380, 465, 492–495, 528, 592–594, 799–801, and 868–871 nm. Photosynthetic pigments are bacteriochlorophyll-a esterified with phytol and carotenoids of the spirilloxanthin series.

Growth occurs preferably under anoxic conditions in the light with organic carbon compounds as carbon and electron sources. Photolithotrophic growth with molecular hydrogen is possible. Chemotrophic growth is possible under microoxic to oxic conditions in the dark. Aerobically grown cells are colorless or pale red. Under phototrophic growth conditions organic carbon compounds are used as carbon and energy sources. The sources utilized are acetate, butyrate, valerate, caproate, lactate, pyruvate, fumarate, malate, and succinate. Pelargonate and propionate may be used by some strains. Not utilized are formate, ethanol, tartrate, citrate, benzoate, cyclohexane carboxylate, methanol, glycerol, glucose, fructose, mannitol, alanine, glutamate, aspartate, arginine, thiosulfate, and sulfide. Sulfide is growth-inhibitory at 2 mM. The nitrogen sources utilized are aspartate, glutamate, glutamine, ammonia, and dinitrogen and also casamino acids, peptone, yeast extract, alanine, arginine, lysine, methionine, serine, threonine, and urea. Sulfate, glutathione, cysteine, thiosulfate, and also sulfite and sulfide at low concentrations can serve as assimilatory sulfur sources. Growth factors are not required. Growth is stimulated, however, in the presence of complex organic nutrients or yeast extract and some strains may need vitamin B12.

Mesophilic freshwater bacterium with optimum growth at 30 °C and pH 6.7–7.4.

Habitat: freshwater ponds, sewage ditches.

The type strain has a G + C content of the DNA of 64.7 mol% (genome analysis) and a genome size of 3.85 Mb.

Type strain: ATCC 25093, DSM 109 (Pfennig: 2761, Grünenplan).

GenBank accession number of the 16S rDNA sequence of the type strain: D16208. GenBank accession number of the genome sequence of the type strain: SSSP00000000.

**Author Contributions:** Conceptualization, J.A.K., J.F.I. and T.E.M.; methodology, J.A.K., F.A.A., and T.E.M.; software, J.A.K., F.A.A., S.K., S.C.N. and T.E.M.; validation, J.A.K., F.A.A., J.F.I., S.K., S.C.N. and T.E.M.; formal analysis, J.A.K., F.A.A., J.F.I., S.K., S.C.N. and T.E.M.; resources, J.A.K., J.F.I. and T.E.M.; data curation, J.A.K., F.A.A., J.F.I., S.K., S.C.N. and T.E.M.; writing—original draft preparation, J.A.K., F.A.A., J.F.I. and T.E.M.; writing—review and editing, J.A.K., J.F.I. and T.E.M.; visualization, J.A.K. and F.A.A.; project administration, J.A.K., J.F.I. and T.E.M.; funding acquisition, J.A.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was sponsored by the Wilson Enhancement Fund for Applied Research in Science at Bellevue University.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession numbers provided in Table 1.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


