## *2.2. Sequences*

Sequences of PufL, PufM, PufH, PufC, and BchXYZ were retrieved from the annotated genomes. Genome sequences were annotated using Rapid Annotations using Subsystems Technology (RAST) [15,16]. All sequences were deposited in the EMBL database. Accession numbers, together with species and strain designations and the corresponding higher taxonomic ranks, are included in Supplementary Table S1.

## *2.3. Phylogenetic Analyses*

Multiple sequence alignments (MSAs) were produced with MAFFT v7.313 [17,18] from all sequences and were visually inspected for consistency. MAFFT was run with the parameters '--globalpair --maxiterate 1000'. Alignment positions with >25% gaps were trimmed from the MSAs. Maximum likelihood (ML) phylogenetic trees were calculated from the MSAs with IQ-TREE v1.6.1 [19] using the best substitution models inferred from the MSAs. For trees calculated from combined alignments ('bchXYZ' and 'bchXYZpufHLM'), the substitution models were applied as partition models [20]. Ultrafast bootstrap approximation (UFBoot) [21] with 1000 replicates was used to provide branch support values, based on the same substitution models as the original ML tree. Branch support values were mapped onto the original ML tree as the number of times each branch in the original tree occurred in the set of bootstrap replicates (IQ-TREE option '-sup').
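The gap-trimming step can be illustrated with a minimal sketch. This is not the script used in the study; the function name `trim_gappy_columns`, the dictionary input format, and the toy alignment are assumptions made for illustration. Only the >25% threshold comes from the text.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.25, gap_chars="-"):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    Hypothetical helper; the 0.25 default mirrors the ">25% gaps" rule.
    """
    seqs = list(alignment.values())
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")

    n = len(seqs)
    # Keep a column only if its gap fraction is at most the threshold.
    keep = [
        col for col in range(length)
        if sum(s[col] in gap_chars for s in seqs) / n <= max_gap_fraction
    ]
    return {name: "".join(seq[c] for c in keep) for name, seq in alignment.items()}


# Toy alignment (invented data): column 3 is 50% gaps and is removed,
# column 2 is exactly 25% gaps and is kept.
toy = {
    "A": "MK-LV",
    "B": "MK-LV",
    "C": "MKALV",
    "D": "M-ALV",
}
trimmed = trim_gappy_columns(toy)
```

Note that a column with exactly 25% gaps is retained here, following a literal reading of ">25%".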

Phylogenetic trees were midpoint-rooted and formatted using functionality from R packages ape v5.0.1 [22], phangorn v2.3.2 [23], and phytools v0.6.45 [24]. Bootstrap values within a range of 80–100% were visualized as filled circles. The circle area is a linear function of the respective bootstrap value. The scale bar beneath a tree indicates the number of substitutions per alignment site. A co-phylogenetic plot was produced to facilitate the comparison of selected phylogenies. Nodes of compared trees were rotated to optimize tip matching.
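The circle scaling described above can be sketched as follows. The text states only that the area is a linear function of the bootstrap value, so this sketch assumes the simplest case, an area directly proportional to the value with no offset. The function name and the `max_radius` parameter are hypothetical.

```python
import math


def support_circle_radius(bootstrap, lo=80.0, hi=100.0, max_radius=1.0):
    """Return the circle radius for a bootstrap value, or None outside [lo, hi].

    Assumption: area is proportional to the bootstrap value, and a value of
    `hi` maps to a circle of radius `max_radius` (arbitrary plotting units).
    """
    if not lo <= bootstrap <= hi:
        return None  # values below 80% are not drawn
    max_area = math.pi * max_radius ** 2
    area = max_area * bootstrap / hi      # linear in the bootstrap value
    return math.sqrt(area / math.pi)      # convert area back to a radius


radii = {b: support_circle_radius(b) for b in (75, 81, 100)}
```

Scaling the area rather than the radius keeps the visual weight of a circle proportional to the support value; scaling the radius linearly would exaggerate differences between high and moderate support.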

#### **3. Results and Discussion**

#### *3.1. Strain and Sequence Selection*

Representatives of phototrophic *Proteobacteria* (10 orders, 21 families, 86 genera, 138 species, 159 strains + five unclassified strains) together with five representatives of *Chloroflexi* (one order, three families, three genera, five species) and six selected *Chlorobi* (one order, one family, four genera, six species), as well as *Gemmatimonas phototrophica*, *Chloracidobacterium thermophilum*, and *Heliobacterium modesticaldum*, were included in the phylogenetic analyses of this study.

Depending on the availability of gene and genomic information, sequence information primarily from type and reference strains was considered. To avoid any incongruity due to strain-dependent sequence variation, sequences from the same strains were used for all phylogenetic trees. All species and strain numbers are presented in Supplementary Table S1.

#### *3.2. Phylogeny According to 16S rRNA Gene Sequences*

Because the 16S rRNA gene has been established as a phylogenetic reference since the pioneering work of Carl Woese [25], we included the phylogenetic tree of this gene, showing the relationships of all strains selected for the present study (RNA tree, Figure 1), and later compared these relationships with those of key proteins of photosynthesis (Figure 4). Clearly separated and distinct major groups with the deepest branching points in the tree were represented by *Chlorobi*, *Chloroflexi*, as well as *Heliobacterium modesticaldum* (representative of the *Firmicutes* phylum), *Chloracidobacterium thermophilum* (representative of the *Acidobacteria* phylum), and *Gemmatimonas phototrophica* (representative of the *Gemmatimonadetes* phylum) (Figure 1). Quite remarkable was the isolated position of *Gemmatimonas*, which encodes a typical proteobacterial photosynthetic apparatus [26,27].

The *Proteobacteria* formed two distinct major branches with all *Alphaproteobacteria* in one branch and the *Gammaproteobacteria* and *Betaproteobacteria* in another branch. In the *Gammaproteobacteria* branch, distinct lineages were represented by *Chromatiaceae*, the *Ectothiorhodospira* group, the *Halorhodospira* group, the *Cellvibrionales* (aerobic anoxygenic phototrophic *Gammaproteobacteria*), and the *Betaproteobacteria*.

A much more complex situation existed within the *Alphaproteobacteria,* with a number of small groups with larger phylogenetic distance. The *Rhodobacterales* and also core groups of *Rhodospirillales*, *Rhizobiales,* and *Sphingomonadales* formed well-supported branches, which were, however, poorly resolved in their relationship to each other. Supported branches were formed by the members of the following genera:


**Figure 1.** Phylogenetic tree (RNA tree) of phototrophic bacteria according to 16S rRNA gene sequences.

Most remarkable were the isolated positions of representatives of *Fulvimarina*, *Hoeflea*, *Labrenzia*, *Rhodothalassium*, and *Afifella-Rhodobium*. Though distant from other phototrophic bacteria, *Brevundimonas* was clearly linked to the *Sphingomonadales* branch. *Rhodovibrio* species appeared as clear outsiders and formed the most deeply branching lineage within the *Alphaproteobacteria*. In addition, several small groups were formed by single species or only a few species. These included species of *Blastochloris*, *Rhodoplanes*, *Rhodoblastus*, *Methylocella*, *Prosthecomicrobium*, and *Rhodomicrobium* (Figure 1).

It should be emphasized that *Roseospirillum parvum* was associated with the *Rhodospirillaceae* and, in particular, with the *Rhodospirillum*/*Pararhodospirillum* group, as were *Caenispirillum* and *Rhodospira trueperi* (Figure 1), supporting the current taxonomic classification [28].

**Figure 2.** Phylogenetic tree of phototrophic bacteria, according to BchXYZ sequences.

#### *3.3. Phylogeny of Photosynthesis*

In order to evaluate the phylogeny of the photosynthetic apparatus, sequences of essential proteins for photosynthesis were analyzed. These included the bacteriochlorophyllide reductase BchXYZ and the photosynthetic reaction center proteins PufHLM and PufC. While the phylogenetic tree of BchXYZ (Figure 2) gave an overview of all considered strains and included all of the phototrophic green bacteria, the tree with combined sequences of PufHLM-BchXYZ (Figure 3) covered all phototrophic purple bacteria. PufC sequences were not considered in these trees because this component was absent from a number of representative species. All sequences and their accession numbers are presented in Supplementary Table S1.

#### 3.3.1. Phylogeny according to BchXYZ Sequences

The phylogeny of BchXYZ allows the widest view on the phylogeny of photosynthesis in phototrophic bacteria, including PS-I and PS-II bacteria. The chlorophyllide reductase BchXYZ catalyzes the first step in bacteriochlorophyll biosynthesis that differentiates this pathway from the biosynthesis of chlorophyll. It is present in all phototrophic bacteria producing different forms of bacteriochlorophyll [10].

The deepest and likewise most ancient roots according to BchXYZ sequences (Figure 2) were found in the phototrophic green bacteria that employ a type-I photosystem, the *Chlorobi*, *Heliobacterium modesticaldum* and relatives, and *Chloracidobacterium thermophilum*, as well as in *Chloroflexi*, which employ a type-II photosystem (like all *Proteobacteria*). The large sequence differences to the phototrophic purple bacteria indicated that bacteriochlorophyll biosynthesis had evolved in ancestors of green bacteria much earlier than in phototrophic purple bacteria. This relationship correlated quite well with the phylogeny of the 16S rRNA gene (RNA tree) (Figure 1), with the exception of *Gemmatimonas phototrophica*, which, according to BchXYZ, was distantly associated with the *Betaproteobacteria*, specifically the *Burkholderiales* with *Rubrivivax* and *Rhodoferax* as representative genera. The phylogeny of photosynthesis in *Proteobacteria* is discussed below on the basis of the more comprehensive information of the BchXYZ-PufHLM sequences (Figure 3).

**Figure 3.** Phylogenetic tree (PS tree) of phototrophic bacteria, according to BchXYZ-PufHLM sequences.

#### 3.3.2. Phylogeny of BchXYZ-PufHLM and Comparison with 16S rRNA Phylogeny

The combined sequence information of the key proteins of the photosynthetic reaction center in photosystem-II bacteria (PufHLM) and of bacteriochlorophyll biosynthesis, represented by the subunits of the chlorophyllide reductase (BchXYZ), gave a solid basis (alignment length, 2458 aa) to trace back the phylogeny of photosynthesis within the phototrophic purple bacteria (Figure 3). The consideration of PufHLM excluded the *Chloroflexi* (which lack PufH) and restricted the view to *Proteobacteria* and *Gemmatimonas*. A direct comparison of the comprehensive phylogeny of anoxygenic photosynthesis, including sequences of BchXYZ-PufHLM (PS tree), with the phylogenetic relations according to 16S rRNA gene sequences (RNA tree) shed light on the evolution of photosynthesis as compared to that of the protein-producing machinery (Figure 4).
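How per-protein alignments combine into a single alignment with partition boundaries can be sketched as follows. This is an illustrative assumption, not the procedure used in the study: the function name, input format, and toy sequences are invented, and the output ranges are plain tuples rather than any specific partition-file format.

```python
def concatenate_alignments(blocks):
    """Concatenate per-protein alignments and record partition ranges.

    blocks: list of (protein_name, alignment) pairs, where each alignment
    maps strain -> aligned sequence. Every block must cover the same strains.
    Returns (combined, partitions); partitions are (name, start, end) tuples
    with 1-based inclusive coordinates in the combined alignment.
    """
    if not blocks:
        return {}, []
    strains = sorted(blocks[0][1])
    combined = {s: "" for s in strains}
    partitions = []
    offset = 0
    for name, aln in blocks:
        if sorted(aln) != strains:
            raise ValueError(f"{name}: strain set differs from the first block")
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{name}: sequences are not aligned to equal length")
        width = lengths.pop()
        for s in strains:
            combined[s] += aln[s]
        partitions.append((name, offset + 1, offset + width))
        offset += width
    return combined, partitions


# Toy blocks (invented sequences, two strains, two proteins).
blocks = [
    ("BchX", {"s1": "MKL", "s2": "MRL"}),
    ("PufL", {"s1": "AV", "s2": "A-"}),
]
combined, partitions = concatenate_alignments(blocks)
```

Requiring the same strain set in every block reflects the use of sequences from the same strains for all trees, and the recorded ranges allow a separate substitution model to be assigned to each protein.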

#### *Gammaproteobacteria* (*Chromatiales* and *Cellvibrionales*)

The phototrophic *Gammaproteobacteria* represented a well-established major phylogenetic branch with four major sub-branches, which were well supported within both PS tree and RNA tree. The sub-branches included


It was remarkable that the species with bacteriochlorophyll-b, according to the PS tree, formed different deeply rooted lineages associated with the corresponding bacteriochlorophyll-a-containing relatives: *Hlr. abdelmalekii* and *Hlr. halochloris* were associated with the *Halorhodospira* branch, *Thiorhodococcus* and *Thioflavicoccus* species with the *Chromatiaceae*, and *Rhodospira trueperi* and *Blastochloris viridis* with the *Rhodospirillaceae*, specifically with the *Rhodospirillum* group, though with low significance (Figure 3). The incorporation of all bacteriochlorophyll-b-containing bacteria within one common cluster is restricted to the phylogeny of the reaction center proteins PufLM [3]. This has been previously explained by the congruent evolution of the reaction center proteins with respect to the specific binding requirements of the bacteriochlorophyll-b molecule [3] and implies the independent evolution of the photosystems with bacteriochlorophyll-b in the different phylogenetic lineages.

#### *Betaproteobacteria* (*Burkholderiales* and *Rhodocyclales*)

One of the most obvious differences between the PS and RNA trees was in the position of the *Betaproteobacteria*. In the RNA tree, the *Rhodocyclales* and the *Burkholderiales* formed two related lineages of a major branch within the frame of the *Gammaproteobacteria* (Figure 1). In the PS tree, both groups formed clearly separated clusters, which were associated with different branches of the *Alphaproteobacteria* (Figure 3). The *Burkholderiales* formed a deep and not reliably rooted branch, including separate lineages of *Rubrivivax*, *Ideonella*/*Roseateles*, *Rhodoferax*/*Limnohabitans*, *Polynucleobacter*, and *Methyloversatilis*. The deep roots identify the photosynthesis of these bacteria as very ancient and, despite the poorly supported branches, could indicate a possible acquisition of photosynthesis by gene transfer from an early phototrophic alphaproteobacterium, as proposed earlier [6,29] (Igarashi et al., 2001; Nagashima and Nagashima, 2013). The whole group was also visible in the RNA tree but associated with the *Gammaproteobacteria*. In the PS tree, *Rhodocyclus* was linked to *Phaeospirillum* and the *Rhodospirillales* (Figures 3 and 4), in contrast to its link to the *Burkholderiales* in the RNA tree (Figure 1). This change might indicate a single transfer of the photosynthesis genes from an ancient alphaproteobacterium within the *Rhodospirillales* frame to a *Rhodocyclus* ancestor.

#### *Microorg* **2019**, *7*, 576
