#### **1. Introduction**

Heliobacteria comprise a unique group of strictly anaerobic, anoxygenic phototrophs that have been isolated from a wide diversity of soil and aquatic habitats [1–4]. Unlike all other phototrophic bacteria, heliobacteria use bacteriochlorophyll (Bchl) *g* as the chief chlorophyll pigment for phototrophic growth [5], but despite their ability to use light as an energy source, heliobacteria are apparently incapable of autotrophic growth and, thus, are obligate heterotrophs [4,6]. Heliobacteria are the only phototrophs of the large bacterial phylum *Firmicutes* [4,7,8], and although they typically stain Gram-negatively, thin sections of cells of heliobacteria exhibit a Gram-positive cell wall morphology [9,10]. In addition to these distinctive properties, cells of heliobacteria are able to differentiate into heat-resistant endospores [4,11], and some heliobacteria have also demonstrated the ability to reduce toxic metals, such as Hg<sup>2+</sup>, and therefore may be useful for applications in bioremediation [12,13].

Species of *Heliobacteriaceae* can be divided into two physiological groups—neutrophiles and alkaliphiles—that track closely with their phylogeny [7] (Figure 1). Included in the neutrophilic clade, the moderate thermophile *Hbt. modesticaldum* was the first heliobacterium to have its genome sequenced and, with its simple phototrophic machinery consisting of a type I reaction center (RC) and no peripheral antenna photocomplex, has been a model organism for studies of photosynthesis and related photochemistry [6,14]. Like other neutrophilic heliobacteria, *Hbt. modesticaldum* exhibits both phototrophic growth in the light and chemotrophic growth in the dark [3,15–17].

**Figure 1.** Phylogenetic (16S rRNA) tree of *Heliorestis convoluta* and related *Firmicutes*. Heliobacteria, the only phototrophic *Firmicutes*, are divided into alkaliphilic and neutrophilic species. *Heliobacterium modesticaldum* (boxed) is the model organism for physiological and biochemical studies of the heliobacteria; *Hrs. convoluta* (boxed) is the first alkaliphilic heliobacterium to have a described genome. Note that the branching pattern shown here suggests a possible alkaliphilic origin of the heliobacteria, as previously discussed by Sattley and Swingley [7]. The weighted neighbor-joining method [18] and Jukes-Cantor corrected distance model were used for tree construction. Nodes represent bootstrap values (≥50%) based on 100 replicates, and *Escherichia coli* was used to root the tree. GenBank accession numbers for each sequence used in the analysis are shown in parentheses. Adapted from Sattley and Swingley [7], *Adv. Bot. Res.* **2013**, *66*, 67–97, Copyright 2013 Elsevier Ltd.

Species of alkaliphilic heliobacteria grow optimally between pH 8 and 9.5 and, unlike neutrophilic heliobacteria, are obligate photoheterotrophs, using light and organic compounds for growth but incapable of chemotrophic growth in darkness [19–22]. Consistent with other alkaliphilic heliobacteria originating from the soils and waters of soda lakes [19,20,22], *Hrs. convoluta* str. HH<sup>T</sup> was isolated from the shore of the alkaline (pH 10) Lake El Hamra (Figure 2A), located in the Wadi El Natroun region of northern Egypt [21]. In the past, the saline lakes of the Wadi El Natroun have also been a fertile source of alkaliphilic purple bacteria, yielding many extremely alkaliphilic (and in some cases also extremely halophilic) species, including, in particular, new species of the genus *Halorhodospira* [23–25]. However, *Hrs. convoluta* is the first heliobacterium to originate from these unusual lakes. Experimental work with *Hrs. convoluta* revealed motile cells having an unusual tightly coiled morphology (Figure 2B) and displaying a mesophilic (optimal growth at 33 °C) and alkaliphilic (optimal growth at pH 8.5–9) physiology [21].

**Figure 2.** Habitat and cells of *Heliorestis convoluta* strain HH<sup>T</sup>. (**A**) Red bloom of alkaliphilic *Bacteria* and *Archaea* on the shore of Lake El Hamra, Wadi El Natroun, Egypt. M.T.M. sampled this bloom in May 2001, and enrichments for heliobacteria yielded *Hrs. convoluta*. The bloom is about 2 m in diameter. (**B**) Scanning electron micrograph of cells of *Hrs. convoluta* strain HH<sup>T</sup>. A cell of *Hrs. convoluta* is about 0.5 μm in diameter and coils are of variable length. Scale bar = 1 μm.

To complement the analysis of the genome sequence of *Hbt. modesticaldum* [6,26], we present here a comparative analysis of the genome of *Hrs. convoluta*. Although a number of highly conserved genes encoding proteins that coordinate key processes in the cell (e.g., phototrophy and central carbon metabolism) are shared between these species, a close comparison of the two heliobacterial genomes revealed several genes encoding functions in carbon metabolism, biotin biosynthesis, nitrogen and sulfur assimilation, and carotenoid biosynthesis that are not held in common by these heliobacteria, which inhabit vastly different extreme environments. In addition, a comparative analysis of selected cytochrome and ATP synthase proteins in *Hrs. convoluta* revealed adaptations that likely facilitate its alkaliphilic lifestyle. The availability of a second heliobacterial genome, as well as the recent development of a genetic system in *Hbt. modesticaldum* [14], paves the way for increasing our understanding of the unique metabolism and physiology of heliobacteria.

#### **2. Materials and Methods**

Total genomic DNA from *Hrs. convoluta* str. HH<sup>T</sup> (ATCC BAA-1281 and DSMZ 19787) [21] was isolated through proteinase K treatment and subsequent phenol extraction. Complete genome sequencing was performed using a random shotgun approach, and reads were assembled using Velvet v. 2010 [27]. Pyrosequencing on a Roche-454 GS20 sequencer (Hoffmann-La Roche AG, Basel, Switzerland) provided 14-fold genome coverage, and an additional 35-fold coverage was generated by the Illumina GAIIx platform.

Annotation of the *Hrs. convoluta* genome was performed in accordance with the Prokaryotic Annotation Pipeline of the University of Maryland School of Medicine's Institute for Genome Sciences [28]. This pipeline employs Glimmer for gene identification and then searches the protein sequences with BLAST-extend-repraze (BER; a combination of BLAST and Smith–Waterman algorithms) to generate pairwise alignments, Hidden Markov Model (HMM), transmembrane (Tm) HMM, and SignalP predictions. An automated process employing the Pfunc evidence hierarchy is used to assign functional annotations. Manual verification of automated annotations was facilitated through the online tool Manatee [29] in conjunction with online databases including the Kyoto Encyclopedia of Genes and Genomes (KEGG), the Braunschweig Enzyme Database (BRENDA), MetaCyc, and UniProt. The National Center for Biotechnology Information (NCBI) database was accessed to retrieve gene and protein sequences from related species for comparative analyses with corresponding genes in the genome of *Hrs. convoluta*.

The phylogenetic tree was generated as described in the legend to Figure 1. Genome statistics were compiled using the Pfam database v. 30.0 [30], the SignalP database v. 4.1 [31], the TMHMM database v. 2.0 [32], and CRISPRFinder v. 2.0 [33]. This complete genome sequence project has been deposited at DDBJ/EMBL/GenBank under accession number CP045875.
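For readers unfamiliar with the distance model named in the legend to Figure 1, the Jukes-Cantor correction converts the observed proportion of differing sites, *p*, between two aligned sequences into an estimated number of substitutions per site, *d* = −(3/4) ln(1 − 4*p*/3). The following is a minimal illustrative sketch only; the function name and the toy sequences are hypothetical, gapped or ambiguous columns are simply skipped (an assumption), and it does not reproduce the software actually used for tree construction.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor distance between two aligned nucleotide sequences.

    Columns containing a gap or an ambiguous base in either sequence
    are skipped (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = mismatches / compared  # observed proportion of differing sites
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy example (hypothetical sequences): 1 difference in 10 sites, p = 0.1
d = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAA")
print(round(d, 4))
```

The corrected distance is always at least as large as *p*, because it accounts for multiple substitutions at the same site.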

#### **3. Results and Discussion**

#### *3.1. Genome Properties*

The 3,218,981 base-pair (bp) genome of *Heliorestis convoluta* str. HH<sup>T</sup> is organized into a single circular chromosome with no plasmids (Table 1). The 43.1% GC content of *Hrs. convoluta* is among the lowest of all heliobacteria (41%–57.7%) and is typical of alkaliphilic species of this group of phototrophs [4]. Nearly 87% of the *Hrs. convoluta* genome content is protein-encoding, with a total of 3263 protein coding genes at an average length of 855 nucleotides (Table 1). The genome contains nine ribosomal RNA (rRNA) genes, including multiple copies each of 5S, 16S (two full and one partial), and 23S rRNA, which are distributed randomly on the chromosome. Nearly 11% of the open reading frames (ORFs) were of unknown enzyme specificity or function, and 28% of genes were annotated as hypothetical. The role category breakdown of protein-encoding genes of *Hrs. convoluta* is shown in Table 2.
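The coding fraction quoted above can be cross-checked from the other figures in the same paragraph. The short sketch below multiplies the number of protein-coding genes by their average length and divides by the genome size; it assumes non-overlapping genes and uses the rounded average length, so the result is only approximate. The function and variable names are illustrative.

```python
# Values taken from the text above; the mean gene length is a rounded figure.
GENOME_SIZE_BP = 3_218_981
PROTEIN_CODING_GENES = 3263
MEAN_GENE_LENGTH_NT = 855


def coding_density(n_genes: int, mean_len_nt: float, genome_bp: int) -> float:
    """Approximate fraction of the genome that is protein-coding.

    Assumes genes do not overlap, which is a simplification.
    """
    return n_genes * mean_len_nt / genome_bp


density = coding_density(PROTEIN_CODING_GENES, MEAN_GENE_LENGTH_NT, GENOME_SIZE_BP)
print(f"{density:.1%}")  # prints 86.7%
```

The estimate of about 86.7% agrees with the "nearly 87%" stated in the text.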


**Table 1.** Comparison of genome features of *Heliorestis convoluta* str. HH<sup>T</sup> and *Heliobacterium modesticaldum* str. Ice1<sup>T</sup> [6].

Genes encoding a total of 105 transfer RNAs (tRNAs) were identified in the *Hrs. convoluta* genome, as well as genes encoding all twenty common aminoacyl-tRNA synthetases except asparaginyl-tRNA synthetase, which could not be confirmed. However, genes encoding aspartyl/glutamyl-tRNA amidotransferase (*gatABC*) were identified in *Hrs. convoluta* and, as proposed for *Hbt. modesticaldum* [6], may encode a protein that compensates for the missing asparaginyl-tRNA synthetase by converting aspartyl-tRNA to asparaginyl-tRNA [34,35].


**Table 2.** Functional role categories of *Heliorestis convoluta* str. HH<sup>T</sup> genes.

\* Total exceeds 100%, as some genes are assigned to more than one role category.

#### *3.2. Central Carbon Metabolism*

Analysis of the *Hrs. convoluta* genome confirmed culture-based observations of the limited set of carbon sources able to support light-driven growth of this species [21]. As an obligate photoheterotroph, *Hrs. convoluta* grows only in anoxic, light conditions when supplied with mineral media containing CO2 plus acetate, pyruvate, propionate, or butyrate as organic carbon sources [21]. Of the 12 described species of heliobacteria (Figure 1), only *Heliorestis acidaminivorans*, *Heliorestis daurensis*, and *Hrs. convoluta* are capable of propionate photoassimilation [4,19,21,22]. Genes encoding enzymes of the methylmalonyl pathway, which converts propionyl-coenzyme A (CoA) to succinyl-CoA for propionate assimilation, were identified in the *Hrs. convoluta* genome (Figure 3). Although a gene encoding propionyl-CoA carboxylase, which is thought to catalyze the first step in the proposed pathway [36,37], was not identified in the *Hrs. convoluta* genome, a gene predicted to encode methylmalonyl-CoA carboxyltransferase (FTV88\_3237), which could circumvent this deficiency, was identified.

Unlike other *Heliorestis* species, *Hrs. convoluta* and a few other heliobacteria can use butyrate as a carbon source [19–21,38]. Analysis of the *Hrs. convoluta* genome revealed genes encoding enzymes that catabolize butyrate to acetyl-CoA for incorporation into the citric acid cycle (CAC) [39] (Figure 3). Genes encoding butyryl-CoA:acetate CoA transferase, which catalyzes the conversion of butyrate to butyryl-CoA in butyrate catabolism [39], and propionyl-CoA synthetase, which converts propionate to propionyl-CoA in propionate catabolism [37], were not identified in the genome of *Hrs. convoluta*. However, an experimentally characterized butyryl-CoA:acetate CoA transferase from *Desulfosarcina cetonica* [39] showed 47% amino acid sequence identity with 4-hydroxybutyrate CoA-transferase (FTV88\_0224) from *Hrs. convoluta*. In addition, the product of a gene annotated as acetyl-coenzyme A synthetase (FTV88\_0994) in *Hrs. convoluta* showed 37% sequence identity with propionyl-CoA synthetase from *Salmonella enterica* and contained the conserved lysine residue (Lys592) required in the initial reaction of propionate catabolism [40]. These findings suggest possible roles for 4-hydroxybutyrate CoA-transferase and acetyl-coenzyme A synthetase in butyrate and propionate catabolism, respectively, in *Hrs. convoluta*.
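As a point of reference for the sequence identity values quoted above, percent identity is typically computed from a pairwise alignment as the number of identical positions divided by the number of positions compared. The sketch below is illustrative only: the choice of denominator (here, columns without gaps) varies between tools and is not specified in the text, and the function name and toy sequences are hypothetical rather than taken from the proteins discussed.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Only columns in which neither sequence has a gap ('-') are counted;
    other tools may instead divide by the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical): 5 identical residues out of 6 ungapped columns
print(round(percent_identity("MKT-LLA", "MRTALLA"), 1))  # prints 83.3
```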

(9) enolase, and (10) pyruvate-phosphate dikinase. CAC enzymes are (11) pyruvate:ferredoxin oxidoreductase, (12) pyruvate carboxylase, (13) phosphoenolpyruvate carboxylase, (14) citrate (*re*)-synthase, (15) aconitate hydratase, (16) NADP<sup>+</sup>-dependent isocitrate dehydrogenase, (17) 2-oxoglutarate synthase/2-oxoglutarate:ferredoxin oxidoreductase, (18) succinyl-CoA synthetase, (19) succinate dehydrogenase/fumarate reductase, (20) fumarate hydratase, and (21) NAD<sup>+</sup>-dependent malate dehydrogenase. Acetyl-CoA metabolism is carried out by (22) acetyl-CoA carboxylase and (23) acetyl-CoA synthetase. Butyrate metabolism enzymes include (24) CoA transferase, (25) acyl-CoA dehydrogenase, (26) enoyl-CoA hydratase, (27) 3-hydroxybutyryl-CoA dehydrogenase, and (28) acetyl-CoA C-acetyltransferase. Propionate metabolism is carried out by (29) CoA transferase, (30) methylmalonyl-CoA carboxyltransferase, (31) methylmalonyl-CoA epimerase, and (32) methylmalonyl-CoA mutase. Amino acid metabolism enzymes are (33) NADP<sup>+</sup>-specific glutamate dehydrogenase, (34) glutamine synthetase, (35) NADPH-dependent glutamate synthase, (36) pyridoxal phosphate-dependent aminotransferase, and (37) asparagine synthase. The enzymes (38) nitrogenase and (39) uptake [NiFe] hydrogenase catalyze nitrogen fixation and H2 oxidation, respectively, and sulfur assimilation is performed by (40) sulfate adenylyltransferase, (41) adenylyl-sulfate kinase, (42) phosphoadenylyl-sulfate reductase, (43) bifunctional oligoribonuclease and PAP phosphatase, and (44) adenylate kinase. Finally, the electron transport chain includes (45) ferredoxin:NADP<sup>+</sup> reductase, (46) NADH:quinone oxidoreductase, (47) cytochrome *bc* complex, and (48) the light-harvesting reaction center. Membrane proteins include ABC transporters (yellow), P-type ATPases (black), ATP synthase (pink), flagellar and motor proteins (brown), other transporters (orange), and other membrane proteins (green).

Although capable of growth on pyruvate, *Hrs. convoluta* str. HH<sup>T</sup> is unable to grow photoheterotrophically on lactate [21], a phenotype distinct from that of most other heliobacteria and the result of an underlying genetic deficiency. In this connection, a gene encoding a putative L-lactate dehydrogenase in *Hbt. modesticaldum* [6] showed no meaningful similarity to any genes in *Hrs. convoluta*. In addition to lactate, no growth was detected when alcohols of any kind were used as sole carbon source in cultures of strain HH<sup>T</sup> [21]. Despite this observation, genes encoding alcohol dehydrogenase and aldehyde dehydrogenase were annotated in the *Hrs. convoluta* genome and, thus, could potentially play a role in non-energetic processes, such as detoxification.

Although a full complement of genes encoding enzymes of the glycolytic and nonoxidative pentose phosphate pathways was present in the *Hrs. convoluta* genome (Figure 3), various common sugars did not support photoheterotrophic growth of strain HH<sup>T</sup> [21]. An inability to use sugars was also originally reported for *Hbt. modesticaldum* [16], but later experimentation showed that *Hbt. modesticaldum* utilized the glycolytic pathway when D-ribose, D-glucose, or D-fructose was supplied with low levels of yeast extract [41]. Although no gene encoding a hexose transporter was annotated in the *Hrs. convoluta* genome, a putative ribose ABC transporter complex (FTV88\_0053, FTV88\_0054, FTV88\_0055) was identified and may allow for carbohydrate transport [41]. As genes encoding glycolytic pathway enzymes are present in the *Hrs. convoluta* genome, it is tempting to speculate that the alkaliphile can utilize sugars in a manner similar to *Hbt. modesticaldum*. The absence of genes encoding glucose 6-phosphate dehydrogenase and 6-phosphogluconolactonase suggests incomplete Entner-Doudoroff and oxidative pentose phosphate pathways, which was also the case for *Hbt. modesticaldum* [6].

It is likely that *Hrs. convoluta* can catalyze many of the steps in the CAC based on biochemical studies of *Hbt. modesticaldum* [42] and high sequence similarity of key CAC enzymes between the two species (Figure 3). However, since both *Hrs. convoluta* and *Hbt. modesticaldum* lack a gene encoding pyruvate dehydrogenase for oxidizing pyruvate to acetyl-CoA, this reaction in heliobacteria is likely catalyzed by the enzyme pyruvate:ferredoxin oxidoreductase (PFOR); the gene encoding PFOR in *Hrs. convoluta* (FTV88\_3370) shares 61% sequence identity to an orthologous gene in *Hbt. modesticaldum* [6]. Furthermore, an unusual citrate synthase, citrate (*re*)-synthase, which specifically catalyzes the addition of the acetyl moiety from acetyl-CoA to the *re* face of the ketone carbon of oxaloacetate [a stereospecificity opposite to that of citrate (*si*)-synthase], has been identified in several clostridia and other strictly anaerobic *Firmicutes*, including *Hbt. modesticaldum* [42]. In *Hrs. convoluta*, a gene (FTV88\_1447) having high amino acid sequence identity (81%) to the gene encoding citrate (*re*)-synthase (HM1\_2993) in *Hbt. modesticaldum* supports the presence of citrate (*re*)-synthase in *Hrs. convoluta* and suggests this unusual form of citrate synthase is common to all heliobacteria.

With regard to photoautotrophic capacity, no genes encoding enzymes of any form of the Calvin-Benson cycle, including ribulose 1,5-bisphosphate carboxylase and phosphoribulokinase, were identified in the *Hrs. convoluta* genome. In addition, the lack of genes encoding key enzymes of other autotrophic pathways, such as malyl-CoA lyase (3-hydroxypropionate/4-hydroxybutyrate pathway) and acetyl-CoA synthase (Wood-Ljungdahl pathway), also prevents *Hrs. convoluta* from assimilating CO2 into organic carbon molecules for growth. The capacity for CO2 fixation by the reverse CAC, as observed in green sulfur bacteria [43], is apparently disrupted by the absence of a gene encoding ATP-citrate lyase. Although an ORF identified as a citrate lyase family protein (FTV88\_0308) was annotated in the *Hrs. convoluta* genome based on sequence identities of approximately 50% with corresponding genes from other *Firmicutes* (but having no similarity to genes in *Hbt. modesticaldum*), biochemical analysis of this gene product would be required to assess its activity and role, if any, in metabolic pathways of *Hrs. convoluta*. Although anaplerotic CO2 assimilation has been shown in heliobacteria supplied with usable organic carbon sources [44], cultures of *Hrs. convoluta* strain HH<sup>T</sup>, like all other cultured heliobacteria, were unable to grow using CO2 as sole carbon source [21], thus supporting the premise that heliobacteria require an organic carbon source during phototrophic growth.

In addition to phototrophy, neutrophilic heliobacteria are able to grow chemotrophically in the dark by pyruvate fermentation [4]. Interestingly, however, the capacity for pyruvate fermentation has not been observed in any alkaliphilic heliobacterial isolate to date, including *Hrs. convoluta* [4,15,17,21]. Studies have suggested that the neutrophile *Hbt. modesticaldum* carries out substrate-level phosphorylation via acetyl-CoA conversion to acetate in dark, anoxic (fermentative) conditions through the activity of phosphotransacetylase (PTA) and acetate kinase (ACK) [15,17,41]. A gene encoding ACK (FTV88\_2009) was annotated in the genome of *Hrs. convoluta* and has 67% sequence identity to a corresponding gene in *Hbt. modesticaldum*. However, a gene encoding PTA could not be identified in either *Hrs. convoluta* or *Hbt. modesticaldum*. Therefore, the genetic determinants that coordinate pyruvate fermentation in neutrophilic heliobacteria but are apparently absent from alkaliphilic heliobacteria remain unidentified.

Three *Hrs. convoluta* genes encoding acetyl-CoA synthetase (ACS) were identified in the genome, one of which showed 87% amino acid sequence identity with the corresponding gene in *Hbt. modesticaldum*. Activity of ACS in *Hbt. modesticaldum* cell extracts was detected only under phototrophic (light/anoxic) conditions, and expression levels of the ACS gene decreased when the bacterium was cultured in darkness [41], thus indicating that, although technically reversible, ACS activity is predominately skewed toward the production of acetyl-CoA from acetate (Figure 3). Activity of ACS therefore allows both *Hbt. modesticaldum* and *Hrs. convoluta* to grow photoheterotrophically using acetate as sole carbon source [16,21].

In contrast to all other heliobacteria, which require biotin for growth, *Hrs. convoluta* and close relative *Hrs. acidaminivorans* (Figure 1) have no growth factor requirements [21,22]. The presence of a full complement of genes (*bioABCDF*) encoding enzymes for biotin biosynthesis allows *Hrs. convoluta* to synthesize biotin, thereby supporting culture-based observations [21]. By contrast, analysis of the *Hbt. modesticaldum* genome revealed the absence of two key genes for biotin biosynthesis, *bioC* and *bioF*, thus explaining the absolute requirement for biotin in that species [16].
