Aldehyde Dehydrogenase Diversity in Azospirillum Genomes

Ricardo Cuatlayotl-Olarte; María Luisa Xiqui-Vázquez; Sandra Raquel Reyes-Carmona; Claudia Mancilla-Simbro; Beatriz Eugenia Baca; Alberto Ramírez-Mata

doi:10.3390/d15121178

¹ Laboratorio de la Interacción Bacteria-Planta, Centro de Investigaciones en Ciencias Microbiológicas, Instituto de Ciencias, Benemérita Universidad Autónoma de Puebla, Puebla 72570, Mexico

² Instituto de Fisiología, Benemérita Universidad Autónoma de Puebla, Puebla 72570, Mexico

* Author to whom correspondence should be addressed.

Diversity 2023, 15(12), 1178; https://doi.org/10.3390/d15121178

This article belongs to the Section Phylogeny and Evolution

Abstract

Aldehyde dehydrogenases (ALDHs) are indispensable enzymes that play a pivotal role in mitigating aldehyde toxicity by converting aldehydes into less reactive compounds. Despite the availability of fully sequenced Azospirillum genomes in public databases, a comprehensive analysis of the ALDH superfamily within these genomes has yet to be undertaken. This study presents the identification and classification of 17 families and 31 subfamilies of ALDHs in fully assembled Azospirillum genomes. This classification framework provides a more comprehensive understanding of the diversity and redundancy of ALDHs across bacterial genomes, which can aid in elucidating the distinct characteristics and functions of each family. The study also proposes the adoption of the ALDH19 family as a powerful phylogenetic marker due to its remarkable conservation and non-redundancy across various Azospirillum species. The diversity of ALDHs among different strains of Azospirillum can influence their adaptation and survival under various environmental conditions. The findings of this study could potentially be used to improve agricultural production by enhancing the growth and productivity of crops. Azospirillum bacteria establish a mutualistic relationship with plants and can promote plant growth by producing phytohormones such as indole-3-acetic acid (IAA). The diversity of ALDHs in Azospirillum can affect the ability of these bacteria, which can be used as biofertilizers to enhance agricultural productivity, to produce IAA and other beneficial compounds that promote plant growth.

Keywords: Azospirillum; aldehyde dehydrogenases; phylogeny; pangenomics; abiotic stress

1. Introduction

Aldehydes are ubiquitous organic compounds found in nature. They can be produced endogenously through processes associated with the metabolism of amino acids (e.g., arginine, proline, lysine, and valine), alcohols, lipid peroxidation (LPO), and carbohydrate oxidation, among others. Although aldehydes are crucial intermediates in various metabolic pathways, their excessive accumulation is toxic: environmental factors that induce cellular stress can raise aldehyde levels beyond a certain threshold, resulting in cytotoxicity [1,2,3,4].

Aldehyde dehydrogenases (ALDHs) are a superfamily of enzymes present in both prokaryotic and eukaryotic organisms. They use NAD(P)⁺ as a cofactor to catalyze the conversion of aldehydes (Figure S1), including fatty, aromatic, and terpenoid aldehydes, into less reactive molecules. ALDHs play a vital role in protecting cells from aldehyde toxicity. Aldehydes are highly reactive molecules that can form adducts with DNA, RNA, and proteins, disrupting cellular homeostasis, deactivating enzymes, and damaging DNA [3,4,5,6]. In addition to their role in detoxification, ALDHs are involved in the synthesis and elimination of a variety of important molecules, including vitamins, amino acids, steroids, betaine, retinoic acid, and gamma-aminobutyric acid (GABA) [6,7].

ALDHs catalyze three types of reactions, which convert aldehydes into carboxylic acids, acyl-coenzyme A thioesters, or acyl phosphates [4,8].

ALDHs are homo-dimeric, homo-tetrameric, or homo-hexameric enzymes; most ALDHs consist of a single domain containing 450–550 amino acids, but some families have additional domains that enable them to catalyze multiple reactions. The ALDH domain can be further divided into three structural domains: a cofactor (NAD or NADP) binding domain, a catalytic domain, and an arm-shaped oligomerization domain (Figure S2) [4,8,9,10,11].

The oxidation of aldehydes into carboxylic acids by ALDH2 and other ALDH family members involves five steps: activation of the catalytic thiol, a nucleophilic attack on the electrophilic aldehyde, formation of a tetrahedral thiohemiacetal intermediate, hydrolysis of the resulting thioester, and dissociation of the reduced cofactor (NADH or NADPH) followed by regeneration of the enzyme through binding of a new NAD(P)⁺ molecule. These steps are mediated by the amino acids Glu268 and Cys302 or their equivalents in other ALDH family members [4,12].

There is great diversity in the ALDH superfamily. Eukaryotes have 18 families comprising 35 subfamilies; humans have 19 ALDHs classified into 10 families, while plants have 14 families [13,14].

Compared to eukaryotes, the diversity of ALDHs in prokaryotes has been less well-studied. One study of 258 Pseudomonas strains found 6510 ALDHs which could be grouped into 42 families. Fourteen of these families accounted for 76% of all ALDHs [8].

Aldehyde dehydrogenases are classified based on sequence identity. ALDHs with at least 40% sequence identity are grouped into the same family, while those with at least 60% sequence identity are classified into the same subfamily. Subfamilies and numbers of ALDHs are designated chronologically [6]. This classification system, which is based on the recommendations of Margaret Dayhoff, is also used for more than 130 other protein superfamilies [7,15]. However, Dayhoff’s methodology has limitations when used to classify ALDHs from microorganisms. When applied to these prokaryotic proteins, new families emerge beyond the current classification. This is likely due to the large number and diversity of ALDHs found in microbial genomes [8].
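The identity thresholds described above can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned (equal length, gaps written as "-") and counts identity over every column in which at least one sequence has a residue; this denominator is an assumption, since alignment tools differ in how they treat gaps. The function names are hypothetical and are not taken from the study.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns that are gaps in both sequences are ignored;
    a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue  # shared gap column carries no information
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def relationship(identity: float) -> str:
    """Apply the 40% (family) and 60% (subfamily) thresholds from the text."""
    if identity >= 60.0:
        return "same subfamily"
    if identity >= 40.0:
        return "same family"
    return "different families"


if __name__ == "__main__":
    identity = percent_identity("MKTC-EA", "MKSC-EG")
    print(round(identity, 1), relationship(identity))
```

Because these thresholds apply to pairs of sequences, a real classification also has to decide how to handle a sequence that passes the threshold with some members of a group but not others.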

The Azospirillum genus comprises land-dwelling microorganisms that typically inhabit damp environments, such as sludge, surface waters, and cultivated soils. These non-spore-forming bacteria are spiral or slightly curved rods that contain polyhydroxybutyrate granules. They stain Gram-negative, and are highly motile due to a single polar flagellum and several shorter lateral flagella. It has been proposed that these Rhodospirillaceae bacteria colonized terrestrial habitats around 200–400 million years ago, coinciding with the emergence of vascular plants [16,17]. This genus includes species that establish a mutualistic relationship with plants and promote their growth. This growth-promoting effect is generally attributed to the ability of these bacteria to fix nitrogen and produce indole-3-acetic acid (IAA). While both abilities are important, nitrogen fixation is typically associated with all known plant-associated strains of this genus, whereas IAA production has only been observed in the species A. brasilense, A. lipoferum, A. argentinense, A. formosense, and A. baldaniorum [16,17].

The most well-studied species of Azospirillum have been isolated from crops of agricultural interest, such as wheat, corn, soybean, tomato, strawberry, and sugar cane. However, some species have also been isolated from soils contaminated with oil, bacterial fuel cells, geysers, and sulfurous waters. This indicates that Azospirillum can tolerate and adapt to a variety of stressful conditions, likely due to the presence of various stress-tolerant mechanisms [18]. Additionally, it has been observed that introducing A. brasilense into the roots of specific plants can reduce the build-up of reactive oxygen species. This capacity has been linked, among other factors, to the production of IAA by the bacterium [19,20].

Azospirillum bacteria have been extensively used as model organisms to study mutualistic interactions between plants and bacteria. While most research has focused on their plant-growth-promoting abilities, their ability to adapt to different environments and stresses is not well understood [19]. This study aims to explore how the diversity of ALDHs among different strains of Azospirillum affects their adaptation and survival under various environmental conditions.

There are no published studies on the diversity of ALDHs in the genomes of the Azospirillum genus. Therefore, we used a bioinformatics approach to search for proteins with the ALDH domain in 17 complete Azospirillum genomes that are available in public databases [20]. The recovered nucleotide and protein sequences were analyzed using bioinformatic tools to classify them according to their phylogeny and probable function.

2. Materials and Methods

2.1. Genome Selection

We retrieved Azospirillum genomes from the NCBI Assembly Database [19]. Only genomes with a complete assembly level in January 2022 were downloaded in GenBank Flat File (GBFF) format for further analysis.

2.2. Aldehyde Dehydrogenases Sequence Searching

To obtain a complete collection of protein-coding sequences, we parsed GBFF files and stored all coding sequences annotated as ORFs in FASTA files, separating amino acid sequences from nucleotide sequences.
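The extraction of amino acid sequences from GBFF files can be sketched as follows. The study does not describe its parser, so this is only a simplified illustration with a hypothetical function name. It assumes the usual NCBI layout, in which feature keys start in column 6, qualifiers are indented beneath them, and /protein_id precedes /translation within a CDS feature. A dedicated parser such as Biopython would be more robust for production use.

```python
def cds_translations(gbff_text: str) -> list:
    """Return (protein_id, amino_acid_sequence) pairs for CDS features.

    Simplified sketch: only /protein_id and /translation are read, and
    CDS features without a /translation qualifier are skipped.
    """
    records = []
    in_cds = False
    protein_id = None
    fragments = None  # holds pieces of a multi-line /translation value
    for line in gbff_text.splitlines():
        stripped = line.strip()
        if fragments is not None:
            # Continuation line of a /translation="..." value.
            fragments.append(stripped.rstrip('"'))
            if stripped.endswith('"'):
                records.append((protein_id, "".join(fragments)))
                fragments = None
            continue
        if line.startswith("     ") and len(line) > 5 and line[5] != " ":
            # A new feature key begins in column 6.
            in_cds = line[5:21].strip() == "CDS"
            protein_id = None
        elif not line.startswith(" "):
            in_cds = False  # left the feature table (e.g. ORIGIN, LOCUS)
        elif in_cds and stripped.startswith("/protein_id="):
            protein_id = stripped.split("=", 1)[1].strip('"')
        elif in_cds and stripped.startswith("/translation="):
            value = stripped.split("=", 1)[1].lstrip('"')
            if value.endswith('"'):
                records.append((protein_id, value.rstrip('"')))
            else:
                fragments = [value]
    return records


def to_fasta(records: list) -> str:
    """Format (identifier, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)
```

The nucleotide sequences would need a separate step that slices the ORIGIN block using each feature location, which this sketch does not attempt.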

To identify proteins containing the ALDH domain, we performed local alignments against the protein collection on the BLAST platform [21], using the aldA gene from Pseudomonas syringae DC3000 as the query sequence [22]. We set an E-value threshold of 0.05 and stored all hits in FASTA files for classification, alignment, and phylogenetic analysis [23].
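The E-value filter can be sketched as a post-processing step on BLAST results. This example assumes the hits were written in the standard 12-column tabular format (BLAST+ `-outfmt 6`), where the subject identifier is the second column and the E-value is the eleventh; the text does not state which output format was used, and the function name is hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 0.05  # threshold stated in the text


def hits_within_cutoff(tabular_text: str, cutoff: float = E_VALUE_CUTOFF) -> list:
    """Return sorted, unique subject IDs whose E-value does not exceed the cutoff.

    Assumes tab-separated columns in the default order:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    subject_ids = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        subject_id = row[1]
        e_value = float(row[10])
        if e_value <= cutoff:
            subject_ids.add(subject_id)
    return sorted(subject_ids)
```

A permissive cutoff such as 0.05 favors sensitivity, so the downstream domain annotation and clustering steps remain responsible for removing spurious hits.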

2.3. Comparative Genome Analyses

To cluster related genes in the pangenome, we used the Anvi’o 7.1 software with the DIAMOND and MCL algorithms. We set the minbit parameter to 0.5 and the mcl inflation parameter to 0.5 to classify and cluster the genes [24]. We then used the Conserved Domains Database to predict the corresponding ALDH family for each cluster [25,26]. All curated data are provided in the file curated_database.xlsx, which is available in the Supplementary Material.
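The role of the minbit parameter can be illustrated with a short sketch. It uses the minbit heuristic as documented for Anvi’o, in which the bit score of a hit between two genes is divided by the smaller of their two self-hit bit scores, and weak pairs are discarded before MCL clustering. The function names and the inclusive comparison are assumptions made for illustration; this is not the Anvi’o source code.

```python
def minbit(bits_ab: float, bits_aa: float, bits_bb: float) -> float:
    """Minbit score: pairwise bit score over the smaller self-hit bit score."""
    return bits_ab / min(bits_aa, bits_bb)


def filter_pairs(pair_bits: dict, self_bits: dict, threshold: float = 0.5) -> list:
    """Keep gene pairs whose minbit score reaches the threshold.

    pair_bits maps (gene_a, gene_b) to the bit score of their alignment;
    self_bits maps each gene to the bit score of its alignment to itself.
    """
    kept = []
    for (gene_a, gene_b), score in pair_bits.items():
        if minbit(score, self_bits[gene_a], self_bits[gene_b]) >= threshold:
            kept.append((gene_a, gene_b))
    return sorted(kept)
```

Normalizing by the self-hit score makes the filter less dependent on protein length, which matters here because the ALDH sequences range from about 400 to more than 1200 residues.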

2.4. Multiple Sequence Alignment and Phylogenetic Analyses

Multiple sequence alignment of the ALDH sequences was performed using the iterative MUSCLE algorithm in MEGA X software [27,28]. The global identity matrix was obtained by analyzing the alignment in UGENE v.46.0 software [29]. With the gaps/missing data treatment set to use all sites and a moderate branch swap filter, we searched for the best substitution model for maximum likelihood analysis using MEGA. Based on their lowest Bayesian information criterion (BIC) values [30], we selected the LG+G+F model for amino acid sequences and the GTR+G model for nucleotide sequences.

We constructed a phylogenetic tree from the amino acid sequence alignment using the maximum likelihood method with the LG+G+F model in MEGA X software [27], with 1000 bootstrap replicates [31].
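Model selection by BIC can be sketched with the standard definition, BIC = k ln(n) − 2 ln(L), where k is the number of estimated parameters, n is the sample size (here taken as the number of alignment sites, which is an assumption) and ln(L) is the maximized log-likelihood. The function names are hypothetical; in practice the log-likelihoods and parameter counts come from the model test in MEGA.

```python
import math


def bic(log_likelihood: float, n_parameters: int, n_sites: int) -> float:
    """Bayesian information criterion; lower values indicate a preferred model."""
    return n_parameters * math.log(n_sites) - 2.0 * log_likelihood


def best_model(candidates: dict, n_sites: int) -> str:
    """Return the name of the candidate with the lowest BIC.

    candidates maps a model name to (log_likelihood, n_parameters).
    """
    return min(
        candidates,
        key=lambda name: bic(candidates[name][0], candidates[name][1], n_sites),
    )
```

The k ln(n) term penalizes parameter-rich models, so a model with a slightly better likelihood can still lose to a simpler one.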

2.5. Protein Modelling, Molecular Conservation, and Structural Analysis

Aldehyde dehydrogenase amino acid sequences from the Azospirillum genus were retrieved from the UniProt database [32]. Representative members of each ALDH family with structures modeled by AlphaFold and deposited in UniProt [33] were selected. These structures were downloaded and stored in PDB format. The amino acid sequences of the downloaded models were compared using the UGENE v.46.0 software [29]. The catalytic cysteine residue of each sequence was identified, and the glutamate residue corresponding to the activator residue was also located.

Using the ChimeraX software (v.1.6) [34], the PDB models were analyzed, including structural alignment (matchmaking). The superimposed models were then visualized to show the positions of the catalytic and activator residues.
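Locating equivalent catalytic residues across families amounts to finding the alignment column of a known residue in a reference sequence and reading the other sequences at that column. The sketch below assumes a gapped multiple alignment held as a dictionary of equal-length strings; the function names and toy sequences are hypothetical, and the study performed this comparison in UGENE rather than with custom code.

```python
def column_of_residue(aligned_ref: str, residue_number: int) -> int:
    """Return the 0-based alignment column of a 1-based reference residue."""
    seen = 0
    for column, symbol in enumerate(aligned_ref):
        if symbol != "-":
            seen += 1
            if seen == residue_number:
                return column
    raise ValueError("residue number exceeds reference length")


def residues_at_column(alignment: dict, column: int) -> dict:
    """Map each sequence name to (residue, 1-based position) at a column.

    Sequences with a gap at that column map to ("-", None).
    """
    result = {}
    for name, aligned in alignment.items():
        symbol = aligned[column]
        if symbol == "-":
            result[name] = ("-", None)
        else:
            # Position in the ungapped sequence = residues seen up to this column.
            position = len(aligned[: column + 1].replace("-", ""))
            result[name] = (symbol, position)
    return result


if __name__ == "__main__":
    toy = {"ref": "MK-CE", "other": "MKACE", "gapped": "MK--E"}
    col = column_of_residue(toy["ref"], 3)
    print(residues_at_column(toy, col))
```

A family that shows a gap or a different residue at the activator column is a candidate for the kind of exception reported for some families in the Results.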

2.6. Aldehyde Dehydrogenase Gene Localization in Azospirillum Genomes

To identify the positions of ALDH loci within each Azospirillum genome, we wrote several shell scripts to compile the locations of each ALDH in all contigs. We then exported this information to CSV files for further analysis, and generated graphs using the RAWGraphs platform [35].

2.7. Aldehyde Dehydrogenase Gene Identification as Potential Phylogenetic Markers

We searched for potential phylogenetic markers within single-copy core genes (SCGs) in the pangenomic analysis performed with Anvi’o software. We constructed a maximum likelihood phylogeny and compared it to the results obtained for the rpoD gene, which has been previously used as a phylogenetic marker for the Azospirillum genus [36].

3. Results

3.1. Selection of Azospirillum Genomes

From the NCBI Assembly Database, we selected 17 genomes with a complete assembly level (Table 1). The genome sizes ranged from 6.32 to 8.1 Mbp, and each genome included chromosome and plasmid sequences (contigs). The strain with the lowest number of contigs was A. brasilense Az39 (6 contigs), while the strain with the highest number of contigs was A. sp TSA2s (10 contigs) (Figure 1).

Table 1. Information of Azospirillum strains used in the study.

Figure 1. Correlation analysis of the relationship between the number of contigs and ALDH abundance. The numbers inside the circles correspond to genome size in megabase pairs (Mbp). The analysis considered 17 Azospirillum genomes retrieved from the NCBI database.

3.2. Number of Aldehyde Dehydrogenases Identified for Each Azospirillum Strain

We identified a total of 315 ALDH sequences, with an average of 18.5 ± 3.7 aldehyde dehydrogenases per genome. The strain with the lowest number of ALDHs was A. ramasamyi M2T2B2 (12 homologs), while the strain with the highest number of ALDHs was A. sp TSH100 (27 homologs).

We did not observe a correlation between the number of contigs and the abundance of ALDHs, suggesting that the number of ALDHs is not associated with horizontal gene transfer through the acquisition of exogenous genetic material. Also, we did not find a correlation between genome size and the number of ALDHs.

The average length of ALDH proteins was 556 ± 190 amino acids (aa). Since the literature reports an approximate size of 500 aa for the ALDH domain, we constructed a length frequency histogram and found three groups. The first group had a length of 396–559 aa, the second group 884–916 aa, and the third group 1237–1252 aa. We analyzed the sequences in the two larger groups using the Conserved Domains Database [25,26]. The second group belongs to the superfamily of alcohol dehydrogenases/aldehyde dehydrogenases, and its length is due to an additional domain with alcohol dehydrogenase (ADH) activity [46]. The third group belongs to the proline dehydrogenases/γ-glutamyl aldehyde dehydrogenases superfamily, and these proteins also have an additional domain, named proline dehydrogenase (PRODH) [49] (Figure 2).

Figure 2. Length frequency distribution of proteins with ALDH domain.
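The grouping of proteins by length can be expressed as a small sketch that assigns each sequence length to one of the three ranges reported above. The range boundaries and domain labels come from the text; the function names and the "unassigned" fallback are assumptions added for illustration.

```python
from collections import Counter

# Length ranges (in amino acids) reported for the three groups.
LENGTH_GROUPS = {
    "single ALDH domain": (396, 559),
    "ALDH + ADH domain": (884, 916),
    "ALDH + PRODH domain": (1237, 1252),
}


def length_group(n_residues: int) -> str:
    """Return the group label for a protein length, or 'unassigned'."""
    for label, (low, high) in LENGTH_GROUPS.items():
        if low <= n_residues <= high:
            return label
    return "unassigned"


def count_groups(lengths: list) -> Counter:
    """Tally how many protein lengths fall into each group."""
    return Counter(length_group(n) for n in lengths)
```

Lengths that fall between the ranges, such as truncated or partial sequences, would surface as "unassigned" and could then be inspected individually.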

3.3. Sequence Alignment and Clustering of Aldehyde Dehydrogenases

The identity matrix from the global alignment shows a wide range of identity (3–100%) (Table S1). To achieve a reliable clustering of ALDH sequences that is not based only on identity percentage, we used the Anvi’o software, which clusters coding genes based on progressive alignments and the Markov Cluster algorithm (MCL) [24]. The 315 ALDH sequences were grouped into 31 clusters of orthologous genes (COGs). We used the Conserved Domains Database (CDD) [25] to identify the ALDH family of each COG, which resulted in the 31 COGs being grouped into 17 families (Table 2). Fifteen ALDH families belonged to the aldehyde dehydrogenase superfamily (ALDH-SF, CDD: cd06534), which is characterized by a single ALDH domain. One ALDH family belonged to the proline dehydrogenase/pyrroline-5-carboxylate dehydrogenase superfamily (PRODH/ALDH, CDD: PRK11905), and the last family belonged to the acetaldehyde-CoA/alcohol dehydrogenase superfamily (ADH/ALDH, CDD: PRK13805). Members of families with additional domains are proteins with divergent sizes, as shown in the histogram above. The distribution of subfamilies and identity per subfamily is shown in Figure 3.

Table 2. Aldehyde dehydrogenase gene families in the Azospirillum genus and their probable function.

Figure 3. Three-level pie chart showing the distribution of ALDH families, subfamilies, and sequence identity within each subfamily.

3.4. Alignment and Phylogenetic Analysis

Using the maximum likelihood method, we generated a phylogenetic tree from the previously conducted alignment (Figure 4). The tree showed clustered branches that maintained the relationships established through the pangenomic analysis and the identification carried out in CDD [24,25,26]. Of the 17 families identified, 10 exhibited a shorter branch distance than the remaining part of the tree, which suggests a shared ancestor. These families were identified as potential ALDHs that transform aldehydes into carboxylic acids (blue group in Figure 4) [52,58,60]. The DhaS, COG1012, and ALDH MSR1-LIKE families were also found in this group (red group in Figure 4), with an identity of about 40% between groups. Notably, we found no literature describing the possible substrates and products of the COG1012 and ALDH MSR1-LIKE families. Future studies should test the substrates of phylogenetically closely related aldehyde dehydrogenases, as previous research has found evidence of substrate promiscuity among various aldehyde dehydrogenases [58]. The other seven families comprised ALDHs capable of producing carboxylic acids (KGS-ALDH and ALDH7), acylating ALDHs (ALDH6), phosphorylating ALDHs (ALDH19), ALDHs with an additional domain (ALDH16 and ALDH20), and ALDHs involved in the degradation of indole acetic acid (PaaZ). For the ALDH cd07078 family, no literature was found that provided information on its possible function [11,49,50,64,70,71,72,73,74]. According to the pangenomic analysis, the strains have comparable varieties of ALDHs, as illustrated in Figure 5.

Figure 4. Phylogenetic tree constructed with 315 ALDH protein sequences from strains of the Azospirillum genus, showing families and clusters identified using the Conserved Domain Database CDD platform (1000 bootstrap).

Figure 5. Number of genes in each ALDH family in 17 complete Azospirillum genomes, with a dendrogram above the table showing the phylogenetic relationship between the strains, as determined using Anvi’o 7.1 software; green and red shadings correspond to higher and lower ALDHs values, respectively.

3.5. Comparison of Structural Models of Aldehyde Dehydrogenase Families

Seventeen AlphaFold-generated representative models, one per ALDH family, were obtained from the UniProt database and superimposed. The most notable structural changes were found in the oligomerization domain; the PaaZ family stands out due to the larger size of this domain compared to other single-domain ALDHs. In experimentally characterized proteins from this family, the extended length of this domain allows them to form trimers instead of the dimers formed by the rest of the families [11,66].

According to our alignment results (Figure 6), the catalytic cysteine residue is in a similar position in all structures, while the catalytic glutamic acid is present in most families, except for ALDH6, ALDH19, and ALDH20. Despite an extensive search, we were unable to find any literature that elucidates the activation mechanism of the catalytic cysteine residue for these families. However, the crystal structures of these ALDHs available in the Protein Data Bank (PDB) demonstrate the absence of the acidic glutamic acid residue that is typically observed in other ALDH families [53,71,75,76,77]. In terms of the Rossmann fold, almost all structures exhibit a five-beta sheet configuration. The exception is the ALDH19 family, which has a four-beta sheet structure (black structure in Figure 7). This is consistent with the crystal structures of this family [77,78].

Figure 6. Multiple sequence alignment of the ALDH families of the Azospirillum genus, constructed using UGENE software. The ALDH consensus sequence is shown above the alignment. The sequences are denoted by their protein ID, followed by the species abbreviation and the ALDH family. This alignment was used in the superposition analysis of the ALDH models obtained by AlphaFold. The first highlighted residue corresponds to the glutamic acid (gray) that activates the catalytic cysteine (purple); the cysteine is highly conserved in most ALDH families.

Figure 7. Structural comparison of superimposed Azospirillum ALDH models. (A) Superimposition of the ALDH models obtained by AlphaFold, showing the Rossmann fold that binds the NAD(P)⁺ coenzyme (highlighted in a red square box). (B) The catalytic cysteine and glutamic acid residues, which interact with the aldehyde substrate and drive the ALDH enzymatic reaction.

3.6. Analyzing Aldehyde Dehydrogenase Families Found in Azospirillum Genomes

Next, we describe the characteristics and data of the aldehyde dehydrogenases families identified in this study, and discuss their potential impact on the metabolic activity within the genus Azospirillum.

ALDH5

We identified 38 ALDHs belonging to the ALDH5 family (CDD-ID: cd07103), with one to four paralogs per genome. This family is also part of the core genes of the genus Azospirillum. Our phylogenomic analyses revealed three subfamilies within the ALDH5 family, with an overall identity of 66.9% ± 19.5% between the proteins of this family. This indicates variability between subfamilies. Notably, 35 of the 38 ALDHs in this family are part of cluster subfamily 1 (CSF-1), while the other subfamilies are only present in A. oryzae KACY14407, A. thiophilum BV-S, and A. sp. TSA2s. In humans, ALDH5 participates in the conversion of succinate semialdehyde (SSA) into succinate. SSA is produced from the decomposition of γ-aminobutyric acid (GABA), a 4-carbon non-protein amino acid, which is a neurotransmitter that inhibits stress signals in the central nervous system. ALDH5 has also been implicated in the metabolism of 4-hydroxy-2-nonenal (4-HNE), a molecule produced by lipid peroxidation that causes oxidative stress in cells [79].

6-OL-ALDH

This family of ALDHs (CDD-ID: cd07138) is present in 15 of the 17 analyzed genomes, including the A. brasilense strains. We identified two subfamilies with an average identity of 47.3% ± 39.7%. The phylogram shows high diversity, but all the sequences appear to have a common ancestor. According to the Conserved Domains Database, these ALDHs are related to the 6-oxolauric aldehyde dehydrogenases of the bacterium Rhodococcus ruber SC1, where a protein of this family (CddD) converts 12-oxolauric acid into dodecanedioic acid [55].

MSR1, DhaS, and COG1012 group

It is important to note that the DhaS, MSR1-like, and COG1012 families share common ancestors with 6-oxo-lauric aldehyde dehydrogenases and type II aldehyde dehydrogenases. Classifying this group of proteins was challenging: when queried against the Conserved Domains Database (CDD), the same proteins returned E-values of zero or near zero for several of these groups, with no consistent pattern. This issue will be explored in greater detail in further discussion.

MSR1-ALDH-LIKE

This family of ALDHs (CDD-ID: cd07108) is present in the strains of A. brasilense, A. thermophilum CFH70021, and A. sp. TSH58, having between one and three paralogs per strain. This family has an average identity of 83.8% ± 13.5% and is grouped into a single subfamily. These sequences are related to an ALDH of the aquatic bacterium Magnetospirillum gryphiswaldense MSR-1, but there is no experimental evidence for its function [57].

DhaS (cd07114)

We identified 18 proteins associated with the Pseudomonas putida ALDH DhaS, which have also been discovered in Bacillus amyloliquefaciens, where they are classified as a potential indole-acetaldehyde dehydrogenase [58,59]. These related proteins are divided into five subfamilies, with an overall identity of 54.7% ± 21.3%. In our phylogenetic tree, they are distributed across two distinct nodes, one of which has a closer connection to the ALDH2 family. The ALDH2 family also includes ALDHs with indole acetaldehyde dehydrogenase activity, suggesting convergent evolution between the two families [58,59,60].

ALDH COG1012

Phylogenetic analysis of the 18 sequences in this family of proteins revealed six subfamilies. These subfamilies are evolutionarily close to the families cd07114 (DhaS), cd07108 (MSR1-like), and cd07559 (ALDH2). There is no experimental evidence to define the substrates that these ALDHs can use. The average identity of these ALDHs is 50.8% ± 22.7%. They are present in 11 of the 17 genomes analyzed, with one to three paralogs per genome. NAD(P)⁺ binding motifs were found, but it is unclear whether these ALDHs are acylating or oxidizing.

KGSADH

This ALDH family (CDD-ID: cd07097) has not been classified according to the proposed nomenclature for ALDHs. Phylogenetic analysis revealed 52 ALDHs distributed into three subfamilies, with an average identity of 52.8% ± 29.6%, highlighting the high variation in identity between subfamilies. The clustering was confirmed using the Conserved Domains Database [25,26]. Each genome carries one to six paralogs of this family. Cluster subfamily 1 (CSF-1) is present in all strains with at least one paralog, while CSF-3 is only present in A. sp TSA2s. No homolog of this family has been found in eukaryotes. In A. brasilense, these proteins could participate in arabinose metabolism by converting α-ketoglutarate semialdehyde into α-ketoglutarate. These proteins prefer NAD⁺ over NADP⁺, and use glutaraldehyde and succinic semialdehyde as substrates [50,51].

ALDH6

This family (CDD-ID: cd07085) is formed by the methylmalonate semialdehyde dehydrogenases. We identified 30 homologs of these ALDHs, which are present in all strains analyzed, with one to four paralogs per genome. They form a single subfamily with an average identity of 71.7% ± 16.0%. These proteins have been found in prokaryotic and eukaryotic organisms, and participate in the distal metabolism of valine. Using NAD⁺ and CoA, they convert methylmalonate semialdehyde to propionyl-CoA [53,54].

ALDH20

We identified 19 homologs of this family (CDD-ID: PRK13805) in the analyzed Azospirillum genomes. The A. lipoferum 4B strain has one paralog of this family, while the paralog found in A. humicireducens SgZ-5 (WP_236783856.1) is a partial sequence that has only the ALDH domain. An alignment of this partial sequence with the WP_108547921.1 sequence of the same strain reveals an identity of 99%, suggesting that the partial protein may have originated through a duplication event of the complete gene. This family consists of two domains: the first with aldehyde dehydrogenase (ALDH) activity and the second with alcohol dehydrogenase (ADH) activity. In fermentative bacteria, this family of proteins participates in the anaerobic production of ethanol from acetyl-CoA, with acetaldehyde as an intermediate. These proteins are usually found in the form of spirosomes, which can acquire various conformational states. They are also sensitive to metal-catalyzed oxidation [49,55], and functional mutations in these proteins have been shown to cause greater sensitivity to heat [56].

ALDH2

We identified 17 ALDHs of this family (CDD-ID: cd07559), with one homolog in each genome, indicating that this family is part of the core of the analyzed genomes. Additionally, our analyses revealed a high identity (90.9% ± 6.2%) among the members of this family. In the bacterium A. brasilense Sp7, experimental evidence has shown that the aldehyde dehydrogenase WP_035679551.1 participates in the conversion of acetaldehyde into acetic acid [22]. Notably, this protein shares 97% identity with the aldehyde dehydrogenase from the bacterium A. brasilense Yu62 [60], which participates in the conversion of indole acetaldehyde into indole acetic acid. Additionally, it has been observed that when A. brasilense Sp7 is in the form of a cyst, the transcription levels of this protein are reduced, which can lead to decreased levels of pyruvate and acetyl-CoA within the cell [61]. In mammals, members of this family are found in the mitochondria and are responsible for converting acetaldehyde to acetic acid in the cells of various organs, including the brain. Deficiency of this protein is associated with poor alcohol metabolism and the development of Alzheimer’s and Parkinson’s diseases [2,80].

ALDH16

The members of this family (CDD-ID: PRK11905) are proline dehydrogenases/pyrroline-5-carboxylate dehydrogenases. In Azospirillum, we found one homolog in each genome, indicating that it is part of the core genome. This family has an average identity of 84.2% ± 9.1%. In the free-living bacterium Sinorhizobium meliloti, this protein, known as PutA, participates in the conversion of L-proline into glutamate through the sequential action of proline dehydrogenase and delta-pyrroline-5-carboxylate dehydrogenase activities, which are conferred by the PRODH and ALDH domains present in this family of ALDHs [62].

ALDH19

This family of ALDHs (CDD-ID: PRK00197) is known as the glutamate-5-semialdehyde dehydrogenases. They require NAD⁺ for activity and have been found in mammals, plants, and bacteria. In mammals and plants, this family is related to ALDH4 and ALDH12, respectively, but because they share less than 30% identity, they are grouped into different families in our analyses, even though they have similar functional motifs and participate in the same pathway, proline catabolism [63]. These proteins convert glutamate-5-semialdehyde into glutamic acid. No additional paralogs were found in the analyzed genomes, and this family has an average identity of 91.3% ± 5.1%. It is classified into a single subfamily and is part of the core genome of the analyzed genomes.

ALDH7

The 16 ALDHs of this family (CDD-ID: cd07130) form a single subfamily with an average identity of 89.5% ± 8.0%. One paralog was found in each genome except for A. thermophilum CFH70021, which lacks the ALDHs of this family. These ALDHs, which are also present in plants and animals, are characterized by their ability to convert α-aminoadipic semialdehyde into α-aminoadipic acid. They typically form tetrameric structures [65,72].

ALDH cd07078

We found 14 members of this family, which shows a highly variable group identity of 47.3% ± 39.7%. The sequences were grouped into a single subfamily, with one member in each strain of the species A. brasilense, A. ramasamyi, A. humicireducens, A. baldaniorum, A. oryzae, and A. sp. We did not find experimental reports that reveal the possible substrates or metabolic pathways in which these enzymes participate.

PaaZ

This bifunctional enzyme participates in pathways that include the degradation of phenylacetic acid. It catalyzes the opening of oxepin-CoA to convert it into 3-oxo-5,6-dehydrosuberyl-CoA. This enzyme has a longer C-terminal region than other classic ALDHs. This additional region allows the protein to form trimers, as well as a hydrophobic tunnel through which the enzyme receives its substrate and carries out the conversion. The enzyme is susceptible to spontaneous conversion to 2-hydroxycyclohepta-1,4,6-triene-1-carboxyl-CoA in the absence of NAD(P)⁺ [66]. We found this ALDH in Azospirillum strains that do not synthesize indole acetic acid, suggesting that these strains may be able to use IAA as a carbon source. However, the validation of this hypothesis will require further investigation in subsequent studies.

ALDH cd07105

We identified a group of five ALDHs in this family that can be classified together, with an average identity of 72.38%. These ALDHs are present in A. thermophilum, A. sp B510, and A. humicireducens SgZ-5, as well as two homologs in A. sp TSH100. This ALDH family is part of an operon that contains up to five enzymes that catalyze the conversion of naphthalene into salicylate [67].

ALDH cd07120

A member of this family was detected in two Azospirillum strains, A. thermophilum CFH70021 and A. sp B510, with a high similarity of 96% between the two members. This family is similar to the psfA gene of Pseudomonas putida Fu1, which participates in the conversion of furfural [68].

ALDH cd07152

A member of this family was identified in A. sp TSH100 and A. sp B510, with 88% identity between the two sequences. This member is similar to benzaldehyde dehydrogenase II of Acinetobacter calcoaceticus [69], suggesting that it may have a similar function.

3.7. Aldehyde Dehydrogenases as Phylogenetic Marker

We examined the location of aldehyde dehydrogenases in the bacterial genomes of the genus Azospirillum to determine whether they were located on the chromosome or on any of the chromids. This investigation was motivated by studies that suggest that chromids typically experience changes in their GC composition over time, and that genes on the chromosome may undergo fewer changes than genes that have been translocated to a chromid [81].

Most ALDH sequences are distributed on chromosomes, chromids, and even plasmids. The ALDH2 and ALDH16 families, which belong to the SCG, are also distributed in this manner. The ALDH19 family, whose members putatively encode phosphorylating aldehyde dehydrogenases, is an exception: of the seventeen genes associated with this family, sixteen remain on the chromosome, while one is present on chromid 1 (in A. thermophilum CFH70021) (Figure 8).

Figure 8. Alluvial diagram showing the localization of Azospirillum genus ALDHs on the chromosome (chr) or on other replicons such as chromids and plasmids (p01 to p07).
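The kind of tally that underlies a localization diagram like this one can be sketched as a per-family count of replicons. In the example below, the record layout, the strain labels, and the replicon label `chromid_1` are hypothetical; only the ALDH19 counts (sixteen chromosomal genes and one on chromid 1) are taken from the text.

```python
from collections import Counter, defaultdict

# Hypothetical records: (family, strain, replicon). Labels are placeholders.
records = [
    ("ALDH2", "strain_1", "chr"),
    ("ALDH2", "strain_2", "p02"),
    ("ALDH2", "strain_3", "p01"),
    ("ALDH16", "strain_1", "chr"),
    ("ALDH16", "strain_2", "p01"),
]


def replicon_counts(rows):
    """Count, for each family, how many genes sit on each replicon."""
    counts = defaultdict(Counter)
    for family, _strain, replicon in rows:
        counts[family][replicon] += 1
    return counts


def chromosomal_fraction(counter):
    """Fraction of a family's genes located on the chromosome ('chr')."""
    total = sum(counter.values())
    return counter["chr"] / total if total else 0.0


# ALDH19 counts as reported in the text: 16 chromosomal, 1 on chromid 1.
aldh19 = Counter({"chr": 16, "chromid_1": 1})

for family, counter in replicon_counts(records).items():
    print(family, dict(counter), round(chromosomal_fraction(counter), 3))
print("ALDH19", dict(aldh19), round(chromosomal_fraction(aldh19), 3))
```

A family whose chromosomal fraction stays close to 1 across strains is a more natural marker candidate than one scattered over several replicons.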

The persistence of this gene on the chromosome, together with the high sequence identity noted earlier, led us to consider it as a potential phylogenetic marker. We therefore compared its sequences with those of the rpoD gene, a well-established phylogenetic marker in Azospirillum.

In addition to the aldH19 and rpoD genes from the Azospirillum genus, sequences from other alphaproteobacteria and the gammaproteobacterium P. syringae DC3000 were included in the analyses. This decision was based on the discovery that both aldH19 and rpoD are SCG genes in these genomes, and their location on the chromosome was confirmed for genomes with more than one contig. Interestingly, the comparison of the phylogenetic trees of aldH19 and rpoD genes revealed that aldH19 has undergone five times fewer changes over time than the rpoD gene. The branch position of aldH19 was retained, establishing a phylogenetic relationship between the alphaproteobacteria and a larger branch for the gammaproteobacterium P. syringae DC3000. Additionally, the branches in the aldH19 phylogenetic tree better grouped individuals of the same species (Figure 9).

Figure 9. Phylogenetic marker utility of aldH19 gene. A comparison of phylogenetic trees from rpoD and aldH19 genes.
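One simple way to put a number on how much two markers vary is to compare their mean pairwise p-distances (the proportion of differing sites) over the same set of strains. The sketch below does this for two invented toy alignments. It is a simplified proxy under stated assumptions, not the tree-based comparison reported here; the sequences, the names `marker_a` and `marker_b`, and the gap handling are all hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    sites = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def mean_pairwise_distance(alignment):
    """Average p-distance over all pairs of sequences in an alignment."""
    names = sorted(alignment)
    distances = [
        p_distance(alignment[m], alignment[n]) for m, n in combinations(names, 2)
    ]
    return sum(distances) / len(distances)


# Invented toy alignments with no biological meaning.
marker_a = {  # stands in for a conserved marker
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAC",
    "s3": "ACGTACGTAT",
}
marker_b = {  # stands in for a more variable marker
    "s1": "ACGTACGTAC",
    "s2": "ACGAACGTTC",
    "s3": "TCGTACCTAT",
}

d_a = mean_pairwise_distance(marker_a)
d_b = mean_pairwise_distance(marker_b)
print(f"marker_a: {d_a:.3f}  marker_b: {d_b:.3f}  ratio b/a: {d_b / d_a:.1f}")
```

Raw p-distances underestimate change at saturated sites, so a model-corrected distance or total tree length would be preferable for real data.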

4. Discussion

The Azospirillum genus exhibits remarkable resilience and adaptability to a wide range of environmental stressors, including oil-contaminated soils, bacterial fuel cells, geysers, and sulfurous waters [40,43]. However, exposure to these stressors can lead to increased aldehyde levels, which can endanger bacterial viability. Other environmental factors that can affect the viability of these bacteria include fluctuations in temperature, pH, salinity, and nutrient availability. The diversity of ALDHs within the Azospirillum genus is likely to play a crucial role in their adaptation and survival under such stressful conditions [42,44,47]. Azospirillum potentially acquired genetic material from coexisting organisms within their shared habitat through horizontal gene transfer [16]. This genetic exchange could have significantly influenced the genus’s ability to undergo evolutionary adaptations, leading to the development of novel metabolic pathways, and the production of a wide range of products and intermediates, including aldehydes. This, in turn, may have contributed to enhanced environmental adaptability. The evolution or acquisition of ALDHs may have been fundamental in the survival of strains capable of utilizing environmental nutrients and responding effectively to stress. Greater diversity of ALDH families facilitates the clearance of different aldehyde types, underscoring the importance of maintaining a sufficient variety of ALDHs for bacterial homeostasis [1,82].

This study highlights the need for a revised, universally applicable classification system for ALDHs. The current system works for eukaryotes, but it is inadequate for prokaryotes, given the growing diversity of ALDHs discovered through ongoing research. As a result, novel ALDHs are being classified without following established guidelines, leading to inconsistencies in nomenclature, numbering, structure, and function across the currently defined ALDH families [4,8,83].

A major challenge in accurately classifying ALDHs is the limitation of platforms such as SMART [84], which only detect the ALDH domain and can therefore only determine whether a given sequence belongs to the ALDH superfamily. Additionally, while the literature generally cites ALDH sizes of approximately 500 amino acids, certain families we found, such as PaaZ, have a larger carboxy-terminal region that enables the formation of homo-hexamers by combining three homo-dimeric structures, deviating from the more commonly reported homo-tetrameric structures [66]. Moreover, some studies incorporate ALDHs with multiple domains but fail to address the impact of these additional domains on the overall function of the protein [8].

We employed pangenomic and phylogenetic clustering techniques, specifically the maximum likelihood approach, to classify ALDH families within the Azospirillum genus. These methods consistently preserved established groupings present in the CDD database, providing strong corroboration for the assigned CDD codes. Subsequently, we undertook a comprehensive literature review to cross-reference the probable functions and, in certain instances, performed identity comparisons with sequences of experimentally validated ALDHs.

We also confirmed the utility of emerging artificial-intelligence-based platforms, such as AlphaFold, for structural approximations and comparative analyses of diverse ALDH families. Notably, these platforms allowed us to examine critical residues such as the catalytic cysteine and activating glutamate, as well as to evaluate ALDHs with extended oligomerization regions or additional domains.

Our study identified ALDHs with single-domain and bi-domain structures. Among the single-domain enzymes, we found enzymes that can generate products that can be incorporated into the tricarboxylic acid (TCA) cycle. For example, the ALDH5 family can generate succinic acid, an intermediate in the TCA cycle [52,85]. The ALDH6 family can generate propionyl-CoA and acetyl-CoA, which are both precursors of succinyl-CoA, another intermediate in the TCA cycle [53,54].

The 6-OL-ALDH family is a group of enzymes that can metabolize long-chain aldehydes, but it does not have a formal family number in the ALDH numbering system. Our results show that the 6-OL-ALDH enzyme from Azospirillum is closely related to other aldehyde dehydrogenases that have been shown to metabolize long-chain aldehydes, such as the AldC enzyme from P. syringae DC3000. The AldC enzyme has been shown to efficiently metabolize aliphatic aldehydes with five to nine carbons [86]. Although there is no experimental evidence to support the function of the 6-OL-ALDH family in Azospirillum, our data suggest that this family of enzymes may have biotechnological potential, especially in the field of bioremediation. This is because the 6-OL-ALDH family may be able to metabolize hydrocarbons that share the same metabolic pathways as long-chain aldehydes derived from oils [87].

Despite potential shared metabolic pathways between IAA and phenylacetic acid, initial phenylalanine deamination in plants appears to be catalyzed by specific enzymes [88]. Therefore, it is crucial to gather experimental evidence regarding the distinct routes through which this phytohormone is synthesized in each microorganism, and to clarify the potential involvement of specific and nonspecific ALDHs. Some of the ALDHs we identified could play a significant role in the metabolism of both phytohormones [89]. This is relevant to the Azospirillum genus because the species A. brasilense, A. baldaniorum, and A. argentinense positively regulate plant growth through IAA, which is produced principally through the indole-3-pyruvate (IPyA) pathway, with the final step of IAA synthesis carried out by aldehyde dehydrogenases [90,91,92,93]. We found families with high identity to experimentally evaluated ALDHs that can perform this function, such as the ALDH2 and DhaS families. The ALDH2 family has high identity with AldA from A. brasilense Yu62 and AldA of P. syringae DC3000 (~90% and ~40%, respectively). In the first case, a mutation of the A. brasilense Yu62 aldA gene resulted in a ~40% reduction in the biosynthesis of this phytohormone [60], while in the second case, the ability to convert indole-3-acetaldehyde (IAC) into IAA was confirmed in P. syringae DC3000, with AldA producing more IAA than the other two ALDHs (AldB and AldC) [58].

In the DhaS family, a related ALDH in Bacillus amyloliquefaciens SQR9 positively regulates its transcription in the presence of the precursor tryptophan. When the dhaS gene was mutated, the yield of IAA produced was only 23% of that produced by the wild-type (WT) strain. However, when the gene was heterologously expressed with other genes from the IPyA pathway, the yield of IAA increased up to 180% of that produced by the WT strain [59].

It is important to note that the COG1012 and MSR-1-like families do not share homology with proteins that have been experimentally tested. However, they have a level of identity with DhaS and ALDH2 families that is relatively close to 40%, which is the threshold used to establish a family. Given the promiscuity observed among phylogenetically related ALDH families, it is plausible that the COG1012 and MSR-1-like families may be able to metabolize substrates similar to those metabolized by the DhaS and ALDH2 families, although their efficiency may vary, as is the case with AldA, AldB, and AldC of P. syringae DC3000 [58,86]. It is worth noting that aldA from A. brasilense Sp7 belongs to the ALDH2 family and has been shown to convert aldehyde into acetate [22]. It is known that some ALDH families are promiscuous in their substrate specificity [94], which suggests that the ALDH2, COG1012, DhaS, and MSR-1-like families could be involved in the metabolism of aldehyde, acetaldehyde, and indole acetaldehyde, with different yields for each substrate type.
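The identity-based rule mentioned above can be written as a small decision function. The 40% family threshold comes from the text. The 60% subfamily cutoff and the use of inclusive comparisons (`>=`) are assumptions based on the commonly used ALDH nomenclature convention and are not stated in this section.

```python
FAMILY_CUTOFF = 40.0     # threshold used to establish a family (from the text)
SUBFAMILY_CUTOFF = 60.0  # assumption: conventional subfamily cutoff, not stated here


def relationship(identity_percent: float) -> str:
    """Classify a pair of ALDH sequences by their percent identity."""
    if not 0.0 <= identity_percent <= 100.0:
        raise ValueError("identity must be a percentage between 0 and 100")
    if identity_percent >= SUBFAMILY_CUTOFF:
        return "same subfamily"
    if identity_percent >= FAMILY_CUTOFF:
        return "same family"
    return "different family"


# Approximate identities quoted in the text, plus one illustrative value
# just below the family threshold.
for label, value in [("~90%", 90.0), ("~67%", 67.0), ("~40%", 40.0), ("38%", 38.0)]:
    print(label, "->", relationship(value))
```

Pairs that fall just under a cutoff, as described for the COG1012 and MSR-1-like families, are exactly the cases where a fixed threshold is least informative and functional evidence matters most.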

α-KGSALDHs have been studied in A. brasilense, where they play a crucial role in arabinose metabolism by converting the intermediate α-KGSA into α-ketoglutarate. Three α-KGSALDHs were examined in these studies and classified into two families: α-KGSALDH I in one family, and α-KGSALDH-II and α-KGSALDH-III in another independent family, with high identity between II and III [51,95]. Our findings indicate that the α-KGSALDHs of Azospirillum are predominantly grouped into two distinct families, except for the protein QCG99333.1 of A. sp TSA2s, which is grouped independently. When comparing our sequences to the α-KGSALDH II and III sequences (accession numbers AB275768 and AB275769), they would be grouped into the larger α-KGSALDH group, while α-KGSALDH-I shows greater similarity to members of the ALDH5 family. Therefore, the second group of α-KGSALDH from Azospirillum lacks experimentally proven α-KGSALDHs, but they exhibit good grouping in this family in the CDD platform classification, with an E-value of 0.0. This group is related to YcbD, an α-KGSALDH from B. subtilis that is part of an operon with the ability to metabolize D-glucarate/galactarate [96].

The PaaZ family was identified in species lacking the ipdC gene, which plays a crucial role in the IPyA pathway of IAA biosynthesis [91,93]. Azospirillum species, known for their beneficial effects associated with IAA synthesis, possess the ipdC gene [17,97]. Examination of the genetic context in the strains where paaZ was found showed that it forms part of a catabolon with genes associated with the degradation of phenylacetic acid and possibly IAA (paaABCDE, paaZ or paaN, paaH, paaG, paaI, paaJ, and paaK). This suggests that strains lacking the IPyA pathway but possessing the Paa pathway may have the ability to use IAA or PAA as carbon sources [11]. This aligns with previous reports that attribute the plant-beneficial effects of strains such as A. oryzae to mechanisms other than IAA synthesis, such as nitrogen fixation [98], and that describe A. ramasamyi as negative for indole production [44]. Therefore, PaaZ would have a dual function as oxepin-CoA hydrolase/3-oxo-5,6-dehydrosuberyl-CoA semialdehyde dehydrogenase, as previously reported in Pseudomonas putida F1 [99].

This corroborates the presence of two distinct groups within the Azospirillum genus. The first group confers a beneficial effect on plants through the production of IAA in the IPyA pathway. On the other hand, the second group, which possesses genes necessary for a metabolic pathway that can degrade this phytohormone, elicits beneficial effects that are distinct from IAA biosynthesis.

Analysis of all examined genomes revealed a solitary copy of the ALDH16 family. Comparison of Azospirillum and Sinorhizobium meliloti members showed a shared identity of approximately 67%, indicating that they belong to the same subfamily and likely possess similar functions. These ALDH16 types have a C-terminal domain with a duplicated Rossmann fold, which does not bind NAD like the Rossmann fold of the ALDH domain. However, the function of this region remains unknown [73].

Our analysis revealed that the ALDH20 family from Azospirillum also has a second domain (ADH). Interestingly, we identified a sequence in the A. humicireducens SgZ-5 strain that shares high homology with the ALDH domain of the ALDH20 family. This sequence consists of 396 amino acids and has the locus A6A40_RS31455 and the protein ID WP_236783856.1. Additionally, we found another gene from the ALDH20 family in this strain that contains both domains and is 899 amino acids long. This gene has the locus A6A40_RS21750 and the protein ID WP_108547921.1. The two genes share 99% sequence identity (sequences are available in Supplementary Material CURATED_DATABASE.xls). However, because we did not find any homologs of the short sequence with a similar size in the NCBI database, we suspect that partial duplication of the long sequence may have occurred. Further testing is needed to determine the functional capacity of this sequence. Although the catalytic cysteine remains intact, the oligomerization domain may be affected due to its incompleteness.
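A quick way to screen for a suspected partial duplication is to slide the short protein along the long one and record the best ungapped match, together with the fraction of the long sequence that the short one covers. The sketch below uses invented placeholder sequences and a deliberately simple ungapped comparison, which is an assumption for this example rather than the alignment method used in the study. Only the two lengths (396 and 899 amino acids) come from the text.

```python
def best_ungapped_match(short: str, long: str):
    """Return (offset, percent identity) of the best ungapped placement of short in long."""
    if not short or len(short) > len(long):
        raise ValueError("short must be non-empty and no longer than long")
    best_offset, best_identity = 0, -1.0
    for offset in range(len(long) - len(short) + 1):
        window = long[offset:offset + len(short)]
        matches = sum(a == b for a, b in zip(short, window))
        identity = 100.0 * matches / len(short)
        if identity > best_identity:
            best_offset, best_identity = offset, identity
    return best_offset, best_identity


def length_coverage(short_len: int, long_len: int) -> float:
    """Fraction of the long sequence length spanned by the short sequence."""
    return short_len / long_len


# Invented placeholder sequences (not the A. humicireducens proteins).
long_seq = "MSTNPKQLAVGHWERTYCCD"
short_seq = "PKQLSVGH"  # fragment of long_seq carrying one substitution

offset, identity = best_ungapped_match(short_seq, long_seq)
print(f"best offset: {offset}, identity: {identity:.1f}%")

# Lengths reported in the text for the short and long ALDH20 sequences.
print(f"length coverage: {length_coverage(396, 899):.3f}")
```

A short sequence that matches one contiguous region of the long one at high identity is consistent with a partial duplication, although insertions or deletions would require a gapped alignment to detect.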

Furthermore, the presence of both ALDH20 and ALDH2 families in all analyzed Azospirillum strains suggests that these bacteria have pathways to produce ethanol through anaerobic fermentation mediated by ALDH20-formed spirosomes as in E. coli [49,76]. Ethanol degradation is mediated by pyruvate decarboxylases and ALDH2. These findings provide valuable insights into the metabolic capabilities of Azospirillum, and may have implications for future research in this area [22].

The ALDH19 family in eukaryotes is characterized by bifunctional proteins, but in Azospirillum, only the ALDH domain appears to operate independently, catalyzing aldehyde phosphorylation. These proteins have four beta sheets, unlike other ALDH families, which have five. Additionally, the catalytic cysteine is activated by a residue other than glutamic acid, although the mechanism of this process has not been documented in the current literature [64,71,77].

Our study found that the ALDH19 family exhibits low sequence variation (Table S1), with only one strain showing a deviation from the chromosomal location (Figure 8). We tested the utility of ALDH19 as a phylogenetic marker for the Azospirillum genus by comparing it to the rpoD marker used in previous studies [36]. Our results showed that ALDH19 sequences have nearly five times less variation than RpoD sequences. Additionally, a phylogenetic tree constructed using ALDH19 sequences better retained current species classification, indicating that ALDH19 can be a reliable tool for organizing new strains of the Azospirillum genus for which complete genomic analysis is not possible. The diversity of ALDHs in Azospirillum bacteria can affect their ability to produce IAA and other beneficial compounds that promote plant growth. Therefore, understanding the genetic diversity of ALDHs in Azospirillum can help identify strains that are more effective in promoting plant growth and can be used as biofertilizers to enhance agricultural productivity.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/d15121178/s1, Figure S1. Subunit structure of Pseudomonas syringae DC3000 indole-3-acetaldehyde dehydrogenase (5IUW). Schematic representation of the structure of a subunit of AldA from P. syringae with bound NAD (spheres representation), showing alpha helices in green, beta sheets in blue, and loops in yellow. A close-up of the Rossmann fold, formed by five beta sheets, is shown in a gray square box. The figure was produced using ChimeraX software. Figure S2. Domain organization of P. syringae DC3000 AldA monomer (5IUW) [58]. The AldA structural subunit is color coded to distinguish the oligomerization domain (blue), catalytic domain (purple), and NAD(P)⁺ binding domain (cyan). Figure S3. Pangenome analysis of 17 Azospirillum genomes using Anvi’o software revealed that the identified ALDHs are distributed among 33 Clusters of Orthologous Genes (COGs). Table S1. Identity matrix of global alignment in supplementary file “aldh_identity_alignment_results.xlsx”. Table S2. Structure-based comparison of aldehyde dehydrogenase models: amino acid sequence information.

Author Contributions

Conceptualization, R.C.-O. and A.R.-M.; methodology, R.C.-O., A.R.-M. and C.M.-S.; software, R.C.-O., S.R.R.-C., C.M.-S. and A.R.-M.; validation, B.E.B., R.C.-O. and A.R.-M.; formal analysis, R.C.-O., B.E.B. and A.R.-M.; investigation, R.C.-O., M.L.X.-V. and B.E.B.; resources, B.E.B. and A.R.-M.; data curation, R.C.-O., C.M.-S., S.R.R.-C. and M.L.X.-V.; writing—original draft preparation, R.C.-O.; writing—review and editing, A.R.-M., B.E.B. and C.M.-S.; visualization, R.C.-O. and A.R.-M.; supervision, S.R.R.-C., C.M.-S., M.L.X.-V. and A.R.-M.; project administration, M.L.X.-V., B.E.B. and A.R.-M.; funding acquisition, B.E.B. and A.R.-M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by VIEP-BUAP to A.R.-M., grant number 00171, and the APC was not funded. R.C.-O. (fellowship number 744837) received a Ph.D. fellowship from CONAHCYT-Mexico.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Genomic data was obtained from the NCBI database. The strains and their corresponding RefSeq accession numbers are listed below: Azospirillum baldaniorum Sp245, GCF_003119195.2; Azospirillum brasilense Sp 7, GCF_008274945.1; Azospirillum brasilense MTCC4039, GCF_005222205.1; Azospirillum brasilense MTCC4038, GCF_005222145.1; Azospirillum brasilense Cd, GCF_008274965.1; Azospirillum lipoferum 4B, GCF_000283655.1; Azospirillum oryzae KACC 14407, GCF_013347285.1; Azospirillum ramasamyi M2T2B2, GCF_003233655.1; Azospirillum sp. B510, GCF_000010725.1; Azospirillum sp. TSA2s, GCF_004923315.1; Azospirillum sp. TSH100, GCF_004923295.1; Azospirillum sp. TSH58, GCF_003119115.1; Azospirillum thermophilum CFH 70021, GCF_003130795.1; Azospirillum thiophilum BV-S, GCF_001305595.1.

Acknowledgments

We acknowledge the support of the Laboratorio Nacional de Supercómputo del Sureste de México (LNS) for providing supercomputing resources under grant 202101037C.

Conflicts of Interest

The authors declare no conflict of interest.

References

Kuykendall, J.R.; Kuykendall, N.S. 15.19—Aldehydes. In Comprehensive Toxicology, 3rd ed.; McQueen, C.A., Ed.; Elsevier: Oxford, UK, 2018; pp. 352–388. ISBN 978-0-08-100601-6. [Google Scholar]
Deza-Ponzio, R.; Herrera, M.L.; Bellini, M.J.; Virgolini, M.B.; Hereñú, C.B. Aldehyde Dehydrogenase 2 in the Spotlight: The Link between Mitochondria and Neurodegeneration. Neurotoxicology 2018, 68, 19–24. [Google Scholar] [CrossRef]
Muzio, G.; Maggiora, M.; Paiuzzi, E.; Oraldi, M.; Canuto, R.A. Aldehyde Dehydrogenases and Cell Proliferation. Free. Radic. Biol. Med. 2012, 52, 735–746. [Google Scholar] [CrossRef] [PubMed]
Shortall, K.; Djeghader, A.; Magner, E.; Soulimane, T. Insights into Aldehyde Dehydrogenase Enzymes: A Structural Perspective. Front. Mol. Biosci. 2021, 8, 659550. [Google Scholar] [CrossRef]
Nosova, T.; Jokelainen, K.; Kaihovaara, P.; Jousimies-Somer, H.; Siitonen, A.; Heine, R.; Salaspuro, M. Aldehyde Dehydrogenase Activity and Acetate Production by Aerobic Bacteria Representing the Normal Flora of Human Large Intestine. Alcohol Alcohol 1996, 31, 555–564. [Google Scholar] [CrossRef]
Tola, A.J.; Jaballi, A.; Germain, H.; Missihoun, T.D. Recent Development on Plant Aldehyde Dehydrogenase Enzymes and Their Functions in Plant Development and Stress Signaling. Genes 2021, 12, 51. [Google Scholar] [CrossRef]
Vasiliou, V.; Nebert, D.W. Analysis and Update of the Human Aldehyde Dehydrogenase (ALDH) Gene Family. Hum Genom. 2005, 2, 138–143. [Google Scholar] [CrossRef]
Riveros-Rosas, H.; Julián-Sánchez, A.; Moreno-Hagelsieb, G.; Muñoz-Clares, R.A. Aldehyde Dehydrogenase Diversity in Bacteria of the Pseudomonas Genus. Chem.-Biol. Interact. 2019, 304, 83–87. [Google Scholar] [CrossRef] [PubMed]
Liu, W.-H.; Chen, F.-F.; Wang, C.-E.; Fu, H.-H.; Fang, X.-Q.; Ye, J.-R.; Shi, J.-Y. Indole-3-Acetic Acid in Burkholderia Pyrrocinia JK-SH007: Enzymatic Identification of the Indole-3-Acetamide Synthesis Pathway. Front. Microbiol. 2019, 10, 2559. [Google Scholar] [CrossRef]
Kim, Y.-G.; Lee, S.; Kwon, O.-S.; Park, S.-Y.; Lee, S.-J.; Park, B.-J.; Kim, K.-J. Redox-Switch Modulation of Human SSADH by Dynamic Catalytic Loop. EMBO J. 2009, 28, 959–968. [Google Scholar] [CrossRef] [PubMed]
Ferrández, A.; García, J.L.; Díaz, E. Transcriptional Regulation of the Divergent paaCatabolic Operons for Phenylacetic Acid Degradation in Escherichia coli. J. Biol. Chem. 2000, 275, 12214–12222. [Google Scholar] [CrossRef]
Koppaka, V.; Thompson, D.C.; Chen, Y.; Ellermann, M.; Nicolaou, K.C.; Juvonen, R.O.; Petersen, D.; Deitrich, R.A.; Hurley, T.D.; Vasiliou, V. Aldehyde Dehydrogenase Inhibitors: A Comprehensive Review of the Pharmacology, Mechanism of Action, Substrate Specificity, and Clinical Application. Pharmacol. Rev. 2012, 64, 520–539. [Google Scholar] [CrossRef] [PubMed]
Hempel, J.; Nicholas, H.; Lindahl, R. Aldehyde Dehydrogenases: Widespread Structural and Functional Diversity within a Shared Framework. Protein Sci. 1993, 2, 1890–1900. [Google Scholar] [CrossRef]
Brocker, C.; Vasiliou, M.; Carpenter, S.; Carpenter, C.; Zhang, Y.; Wang, X.; Kotchoni, S.O.; Wood, A.J.; Kirch, H.-H.; Kopečný, D.; et al. Aldehyde Dehydrogenase (ALDH) Superfamily in Plants: Gene Nomenclature and Comparative Genomics. Planta 2013, 237, 189–210. [Google Scholar] [CrossRef] [PubMed]
Nebert, D.W.; Wain, H.M. Update on Human Genome Completion and Annotations: Gene Nomenclature. Hum. Genom. 2003, 1, 66–71. [Google Scholar] [CrossRef] [PubMed][Green Version]
Wisniewski-Dyé, F.; Borziak, K.; Khalsa-Moyers, G.; Alexandre, G.; Sukharnikov, L.O.; Wuichet, K.; Hurst, G.B.; McDonald, W.H.; Robertson, J.S.; Barbe, V.; et al. Azospirillum Genomes Reveal Transition of Bacteria from Aquatic to Terrestrial Environments. PLoS Genet. 2011, 7, e1002430. [Google Scholar] [CrossRef]
Pedraza, R.O.; Filippone, M.P.; Fontana, C.; Salazar, S.M.; Ramírez-Mata, A.; Sierra-Cacho, D.; Baca, B.E. Chapter 6—Azospirillum. In Beneficial Microbes in Agro-Ecology; Amaresan, N., Senthil Kumar, M., Annapurna, K., Kumar, K., Sankaranarayanan, A., Eds.; Academic Press: Cambridge, MA, USA, 2020; pp. 73–105. ISBN 978-0-12-823414-3. [Google Scholar]
Reis, V.M.; Baldani, V.L.D.; Baldani, J.I. Isolation, Identification and Biochemical Characterization of Azospirillum Spp. and Other Nitrogen-Fixing Bacteria. In Handbook for Azospirillum: Technical Issues and Protocols; Cassán, F.D., Okon, Y., Creus, C.M., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 3–26. ISBN 978-3-319-06542-7. [Google Scholar]
Cassán, F.; López, G.; Nievas, S.; Coniglio, A.; Torres, D.; Donadio, F.; Molina, R.; Mora, V. What Do We Know About the Publications Related with Azospirillum? A Metadata Analysis. Microb. Ecol. 2021, 81, 278–281. [Google Scholar] [CrossRef]
Datasets—NCBI—NIH. Available online: https://www.ncbi.nlm.nih.gov/datasets/genomes/?taxon=191&utm_source=data-hub (accessed on 3 February 2022).
McGinnis, S.; Madden, T.L. BLAST: At the Core of a Powerful and Diverse Set of Sequence Analysis Tools. Nucleic Acids Res 2004, 32, W20–W25. [Google Scholar] [CrossRef]
Singh, V.S.; Dubey, B.K.; Pandey, P.; Rai, S.; Tripathi, A.K. Co-Metabolism of Ethanol in Azospirillum brasilense Sp7 Is Mediated by Fructose and Glycerol and Regulated Negatively by an Alternative Sigma Factor RpoH2. J. Bacteriol. 2021, 203, JB0026921. [Google Scholar] [CrossRef]
Kerfeld, C.A.; Scott, K.M. Using BLAST to Teach “E-Value-Tionary” Concepts. PLoS Biol. 2011, 9, e1001014. [Google Scholar] [CrossRef]
Eren, A.M.; Esen, Ö.C.; Quince, C.; Vineis, J.H.; Morrison, H.G.; Sogin, M.L.; Delmont, T.O. Anvi’o: An Advanced Analysis and Visualization Platform for ’omics Data. PeerJ 2015, 3, e1319. [Google Scholar] [CrossRef]
Marchler-Bauer, A.; Derbyshire, M.K.; Gonzales, N.R.; Lu, S.; Chitsaz, F.; Geer, L.Y.; Geer, R.C.; He, J.; Gwadz, M.; Hurwitz, D.I.; et al. CDD: NCBI’s Conserved Domain Database. Nucleic Acids Res. 2015, 43, D222-226. [Google Scholar] [CrossRef] [PubMed]
Marchler-Bauer, A.; Lu, S.; Anderson, J.B.; Chitsaz, F.; Derbyshire, M.K.; DeWeese-Scott, C.; Fong, J.H.; Geer, L.Y.; Geer, R.C.; Gonzales, N.R.; et al. CDD: A Conserved Domain Database for the Functional Annotation of Proteins. Nucleic Acids Res. 2011, 39, D225–D229. [Google Scholar] [CrossRef] [PubMed]
Kumar, S.; Stecher, G.; Li, M.; Knyaz, C.; Tamura, K. MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms. Mol. Biol. Evol. 2018, 35, 1547–1549. [Google Scholar] [CrossRef]
Edgar, R.C. MUSCLE: Multiple Sequence Alignment with High Accuracy and High Throughput. Nucleic Acids Res. 2004, 32, 1792–1797. [Google Scholar] [CrossRef]
Okonechnikov, K.; Golosova, O.; Fursov, M. Unipro UGENE: A Unified Bioinformatics Toolkit. Bioinformatics 2012, 28, 1166–1167. [Google Scholar] [CrossRef] [PubMed]
Posada, D. jModelTest: Phylogenetic Model Averaging. Mol. Biol. Evol. 2008, 25, 1253–1256. [Google Scholar] [CrossRef]
Le, S.Q.; Gascuel, O. An Improved General Amino Acid Replacement Matrix. Mol. Biol. Evol. 2008, 25, 1307–1320. [Google Scholar] [CrossRef]
UniProt: The Universal Protein Knowledgebase—PMC. Available online: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5210571/ (accessed on 30 August 2023).
Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; et al. Highly Accurate Protein Structure Prediction with AlphaFold. Nature 2021, 596, 583–589. [Google Scholar] [CrossRef]
Pettersen, E.F.; Goddard, T.D.; Huang, C.C.; Meng, E.C.; Couch, G.S.; Croll, T.I.; Morris, J.H.; Ferrin, T.E. UCSF ChimeraX: Structure Visualization for Researchers, Educators, and Developers. Protein Sci. 2021, 30, 70–82. [Google Scholar] [CrossRef]
Mauri, M.; Elli, T.; Caviglia, G.; Uboldi, G.; Azzi, M. RAWGraphs: A Visualisation Platform to Create Open Outputs. In Proceedings of the Proceedings of the 12th Biannual Conference on Italian SIGCHI Chapter, Cagliari, Italy, 18–20 September 2017; Association for Computing Machinery: New York, NY, USA, 2017; pp. 1–5. [Google Scholar]
Maroniche, G.A.; García, J.E.; Salcedo, F.; Creus, C.M. Molecular Identification of Azospirillum Spp.: Limitations of 16S rRNA and Qualities of rpoD as Genetic Markers. Microbiol. Res. 2017, 195, 1–10. [Google Scholar] [CrossRef]
Tripathi, A.K.; Singh, C. Azospirillum brasilense Genome Assembly ASM131501v1. Available online: https://www.ncbi.nlm.nih.gov/data-hub/assembly/GCF_001315015.1/ (accessed on 21 October 2022).
Singh, C.; Pandey, P.; Singh, D.N.; Pandey, R.; Shasany, A.K.; Tripathi, A.K. Whole-Genome Sequences of Four Indian Isolates of Azospirillum brasilense. Microbiol. Resour. Announc. 2019, 8, e00633-19. [Google Scholar] [CrossRef] [PubMed]
Jang, J.; Sakai, Y.; Senoo, K.; Ishii, S. Potentially Mobile Denitrification Genes Identified in Azospirillum Sp. Strain TSH58. Appl. Environ. Microbiol. 2019, 85, e02474-18. [Google Scholar] [CrossRef] [PubMed]
Rivera, D.; Revale, S.; Molina, R.; Gualpa, J.; Puente, M.; Maroniche, G.; Paris, G.; Baker, D.; Clavijo, B.; McLay, K.; et al. Complete Genome Sequence of the Model Rhizosphere Strain Azospirillum brasilense Az39, Successfully Applied in Agriculture. Genome Announc. 2014, 2, e00683-14. [Google Scholar] [CrossRef] [PubMed]
Tripathi, A.K.; Singh, C. ASM827496v1—Genome—Assembly—NCBI. Available online: https://www.ncbi.nlm.nih.gov/assembly/GCF_008274965.1/?shouldredirect=false (accessed on 21 July 2022).
Zhou, S.; Han, L.; Wang, Y.; Yang, G.; Zhuang, L.; Hu, P. Azospirillum humicireducens Sp. Nov., a Nitrogen-Fixing Bacterium Isolated from a Microbial Fuel Cell. Int. J. Syst. Evol. Microbiol. 2013, 63, 2618–2624. [Google Scholar] [CrossRef] [PubMed]
Kim, M.; Park, Y.-J.; Shin, J.-H. ASM1334728v1—Genome—Assembly—NCBI. Available online: https://www.ncbi.nlm.nih.gov/assembly/GCF_013347285.1/ (accessed on 21 November 2023).
Anandham, R.; Heo, J.; Krishnamoorthy, R.; SenthilKumar, M.; Gopal, N.O.; Kim, S.-J.; Kwon, S.-W. Azospirillum ramasamyi Sp. Nov., a Novel Diazotrophic Bacterium Isolated from Fermented Bovine Products. Int. J. Syst. Evol. Microbiol. 2019, 69, 1369–1375. [Google Scholar] [CrossRef]
Kaneko, T.; Minamisawa, K.; Isawa, T.; Nakatsukasa, H.; Mitsui, H.; Kawaharada, Y.; Nakamura, Y.; Watanabe, A.; Kawashima, K.; Ono, A.; et al. Complete Genomic Structure of the Cultivated Rice Endophyte Azospirillum sp B510. DNA Res. 2010, 17, 37–50. [Google Scholar] [CrossRef]
Gao, N.; Shen, W.; Nishizawa, T.; Isobe, K.; Guo, Y.; Ying, H.; Senoo, K. Genome Sequences of Two Azospirillum sp. Strains, TSA2S and TSH100, Plant Growth-Promoting Rhizobacteria with N2O Mitigation Abilities. Microbiol. Resour. Announc. 2019, 8, e00459-19. [Google Scholar] [CrossRef] [PubMed]
Zhao, Z.; Ming, H.; Ding, C.-L.; Ji, W.-L.; Cheng, L.-J.; Niu, M.; Zhang, Y.; Zhang, L.-Y.; Meng, X.-L.; Nie, G.-X. Azospirillum thermophilum sp. Nov., Isolated from a Hot Spring. Int. J. Syst. Evol. Microbiol. 2020, 70, 550–554. [Google Scholar] [CrossRef] [PubMed]
Fomenkov, A.; Vincze, T.; Grabovich, M.; Anton, B.P.; Dubinina, G.; Orlova, M.; Belousova, E.; Roberts, R.J. Complete Genome Sequence of a Strain of Azospirillum thiophilum Isolated from a Sulfide Spring. Genome Announc. 2016, 4, e01521-15. [Google Scholar] [CrossRef]
Pony, P.; Rapisarda, C.; Terradot, L.; Marza, E.; Fronzes, R. Filamentation of the Bacterial Bi-Functional Alcohol/Aldehyde Dehydrogenase AdhE Is Essential for Substrate Channeling and Enzymatic Regulation. Nat. Commun. 2020, 11, 1426. [Google Scholar] [CrossRef]
Tretter, L.; Adam-Vizi, V. Alpha-Ketoglutarate Dehydrogenase: A Target and Generator of Oxidative Stress. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2005, 360, 2335–2345. [Google Scholar] [CrossRef]
Watanabe, S.; Yamada, M.; Ohtsu, I.; Makino, K. α-Ketoglutaric Semialdehyde Dehydrogenase Isozymes Involved in Metabolic Pathways of D-Glucarate, D-Galactarate, and Hydroxy-L-Proline: Molecular and Metabolic Convergent Evolution. J. Biol. Chem. 2007, 282, 6685–6695. [Google Scholar] [CrossRef] [PubMed]
Malaspina, P.; Picklo, M.J.; Jakobs, C.; Snead, O.C.; Gibson, K.M. Comparative Genomics of Aldehyde Dehydrogenase 5a1 (Succinate Semialdehyde Dehydrogenase) and Accumulation of Gamma-Hydroxybutyrate Associated with Its Deficiency. Hum. Genom. 2009, 3, 106–120. [Google Scholar] [CrossRef]
Talfournier, F.; Stines-Chaumeil, C.; Branlant, G. Methylmalonate-Semialdehyde Dehydrogenase from Bacillus subtilis: Substrate Specificity and Coenzyme A Binding. J. Biol. Chem. 2011, 286, 21971–21981. [Google Scholar] [CrossRef]
Steele, M.I.; Lorenz, D.; Hatter, K.; Park, A.; Sokatch, J.R. Characterization of the mmsAB Operon of Pseudomonas aeruginosa PAO Encoding Methylmalonate-Semialdehyde Dehydrogenase and 3-Hydroxyisobutyrate Dehydrogenase. J. Biol. Chem. 1992, 267, 13585–13592. [Google Scholar] [CrossRef] [PubMed]
Kostichka, K.; Thomas, S.M.; Gibson, K.J.; Nagarajan, V.; Cheng, Q. Cloning and Characterization of a Gene Cluster for Cyclododecanone Oxidation in Rhodococcus ruber SC1. J. Bacteriol. 2001, 183, 6478–6486. [Google Scholar] [CrossRef]
Membrillo-Hernández, J.; Echave, P.; Cabiscol, E.; Tamarit, J.; Ros, J.; Lin, E.C.C. Evolution of the adhE Gene Product of Escherichia coli from a Functional Reductase to a Dehydrogenase: Genetic and Biochemical Studies of the Mutant Proteins. J. Biol. Chem. 2000, 275, 33869–33875. [Google Scholar] [CrossRef]
Richter, M.; Kube, M.; Bazylinski, D.A.; Lombardot, T.; Glöckner, F.O.; Reinhardt, R.; Schüler, D. Comparative Genome Analysis of Four Magnetotactic Bacteria Reveals a Complex Set of Group-Specific Genes Implicated in Magnetosome Biomineralization and Function. J. Bacteriol. 2007, 189, 4899–4910. [Google Scholar] [CrossRef]
McClerklin, S.A.; Lee, S.G.; Harper, C.P.; Nwumeh, R.; Jez, J.M.; Kunkel, B.N. Indole-3-Acetaldehyde Dehydrogenase-Dependent Auxin Synthesis Contributes to Virulence of Pseudomonas syringae Strain DC3000. PLoS Pathog. 2018, 14, e1006811. [Google Scholar] [CrossRef] [PubMed]
Shao, J.; Li, S.; Zhang, N.; Cui, X.; Zhou, X.; Zhang, G.; Shen, Q.; Zhang, R. Analysis and Cloning of the Synthetic Pathway of the Phytohormone Indole-3-Acetic Acid in the Plant-Beneficial Bacillus amyloliquefaciens SQR9. Microb. Cell Fact. 2015, 14, 130. [Google Scholar] [CrossRef]
Xie, B.; Xu, K.; Zhao, H.X.; Chen, S.F. Isolation of Transposon Mutants from Azospirillum brasilense Yu62 and Characterization of Genes Involved in Indole-3-Acetic Acid Biosynthesis. FEMS Microbiol. Lett. 2005, 248, 57–63. [Google Scholar] [CrossRef] [PubMed]
Malinich, E.A.; Bauer, C.E. Transcriptome Analysis of Azospirillum brasilense Vegetative and Cyst States Reveals Large-Scale Alterations in Metabolic and Replicative Gene Expression. Microb. Genom. 2018, 4, e000200. [Google Scholar] [CrossRef] [PubMed]
Chen, S.; White, C.E.; diCenzo, G.C.; Zhang, Y.; Stogios, P.J.; Savchenko, A.; Finan, T.M. L-Hydroxyproline and d-Proline Catabolism in Sinorhizobium meliloti. J. Bacteriol. 2016, 198, 1171–1181. [Google Scholar] [CrossRef]
Korasick, D.A.; Končitíková, R.; Kopečná, M.; Hájková, E.; Vigouroux, A.; Moréra, S.; Becker, D.F.; Šebela, M.; Tanner, J.J.; Kopečný, D. Structural and Biochemical Characterization of Aldehyde Dehydrogenase 12, the Last Enzyme of Proline Catabolism in Plants. J. Mol. Biol. 2019, 431, 576–592. [Google Scholar] [CrossRef]
Vasiliou, V.; Bairoch, A.; Tipton, K.F.; Nebert, D.W. Eukaryotic Aldehyde Dehydrogenase (ALDH) Genes: Human Polymorphisms, and Recommended Nomenclature Based on Divergent Evolution and Chromosomal Mapping. Pharmacogenetics 1999, 9, 421–434. [Google Scholar]
Končitíková, R.; Vigouroux, A.; Kopečná, M.; Andree, T.; Bartoš, J.; Šebela, M.; Moréra, S.; Kopečný, D. Role and Structural Characterization of Plant Aldehyde Dehydrogenases from Family 2 and Family 7. Biochem. J. 2015, 468, 109–123. [Google Scholar] [CrossRef]
Sathyanarayanan, N.; Cannone, G.; Gakhar, L.; Katagihallimath, N.; Sowdhamini, R.; Ramaswamy, S.; Vinothkumar, K.R. Molecular Basis for Metabolite Channeling in a Ring Opening Enzyme of the Phenylacetate Degradation Pathway. Nat. Commun. 2019, 10, 4127. [Google Scholar] [CrossRef] [PubMed]
Denome, S.A.; Stanley, D.C.; Olson, E.S.; Young, K.D. Metabolism of Dibenzothiophene and Naphthalene in Pseudomonas Strains: Complete DNA Sequence of an Upper Naphthalene Catabolic Pathway. J. Bacteriol. 1993, 175, 6890–6901. [Google Scholar] [CrossRef] [PubMed]
Nichols, N.N.; Mertens, J.A. Identification and Transcriptional Profiling of Pseudomonas putida Genes Involved in Furoic Acid Metabolism. FEMS Microbiol. Lett. 2008, 284, 52–57. [Google Scholar] [CrossRef] [PubMed]
Gillooly, D.J.; Robertson, A.G.; Fewson, C.A. Molecular Characterization of Benzyl Alcohol Dehydrogenase and Benzaldehyde Dehydrogenase II of Acinetobacter calcoaceticus. Biochem. J. 1998, 330 Pt 3, 1375–1381. [Google Scholar] [CrossRef]
Krishna, R.V.; Beilstein, P.; Leisinger, T. Biosynthesis of Proline in Pseudomonas aeruginosa. Properties of Gamma-Glutamyl Phosphate Reductase and 1-Pyrroline-5-Carboxylate Reductase. Biochem. J. 1979, 181, 223–230. [Google Scholar] [CrossRef]
Cherney, L.T.; Cherney, M.M.; Garen, C.R.; Niu, C.; Moradian, F.; James, M.N.G. Crystal Structure of N-Acetyl-Gamma-Glutamyl-Phosphate Reductase from Mycobacterium tuberculosis in Complex with NADP(+). J. Mol. Biol. 2007, 367, 1357–1369. [Google Scholar] [CrossRef]
Brocker, C.; Lassen, N.; Estey, T.; Pappa, A.; Cantore, M.; Orlova, V.V.; Chavakis, T.; Kavanagh, K.L.; Oppermann, U.; Vasiliou, V. Aldehyde Dehydrogenase 7A1 (ALDH7A1) Is a Novel Enzyme Involved in Cellular Defense against Hyperosmotic Stress. J. Biol. Chem. 2010, 285, 18452–18463. [Google Scholar] [CrossRef]
Liu, L.-K.; Becker, D.F.; Tanner, J.J. Structure, Function, and Mechanism of Proline Utilization A (PutA). Arch. Biochem. Biophys. 2017, 632, 142–157. [Google Scholar] [CrossRef]
Hou, Q.; Bartels, D. Comparative Study of the Aldehyde Dehydrogenase (ALDH) Gene Superfamily in the Glycophyte Arabidopsis thaliana and Eutrema halophytes. Ann. Bot. 2015, 115, 465–479. [Google Scholar] [CrossRef] [PubMed]
van Lis, R.; Popek, M.; Couté, Y.; Kosta, A.; Drapier, D.; Nitschke, W.; Atteia, A. Concerted Up-Regulation of Aldehyde/Alcohol Dehydrogenase (ADHE) and Starch in Chlamydomonas reinhardtii Increases Survival under Dark Anoxia. J. Biol. Chem. 2017, 292, 2395–2410. [Google Scholar] [CrossRef] [PubMed]
Kim, G.; Yang, J.; Jang, J.; Choi, J.-S.; Roe, A.J.; Byron, O.; Seok, C.; Song, J.-J. Aldehyde-Alcohol Dehydrogenase Undergoes Structural Transition to Form Extended Spirosomes for Substrate Channeling. Commun. Biol. 2020, 3, 1–9. [Google Scholar] [CrossRef] [PubMed]
Page, R.; Nelson, M.S.; von Delft, F.; Elsliger, M.-A.; Canaves, J.M.; Brinen, L.S.; Dai, X.; Deacon, A.M.; Floyd, R.; Godzik, A.; et al. Crystal Structure of Gamma-Glutamyl Phosphate Reductase (TM0293) from Thermotoga maritima at 2.0 Å Resolution. Proteins 2004, 54, 157–161. [Google Scholar] [CrossRef] [PubMed]
Baugh, L.; Gallagher, L.A.; Patrapuvich, R.; Clifton, M.C.; Gardberg, A.S.; Edwards, T.E.; Armour, B.; Begley, D.W.; Dieterich, S.H.; Dranow, D.M.; et al. Combining Functional and Structural Genomics to Sample the Essential Burkholderia Structome. PLoS ONE 2013, 8, e53851. [Google Scholar] [CrossRef] [PubMed]
Alizadeh Behbahani, B.; Jooyandeh, H.; Falah, F.; Vasiee, A. Gamma-aminobutyric Acid Production by Lactobacillus brevis A3: Optimization of Production, Antioxidant Potential, Cell Toxicity, and Antimicrobial Activity. Food Sci. Nutr. 2020, 8, 5330–5339. [Google Scholar] [CrossRef]
Chen, C.-H.; Joshi, A.U.; Mochly-Rosen, D. The Role of Mitochondrial Aldehyde Dehydrogenase 2 (ALDH2) in Neuropathology and Neurodegeneration. Acta Neurol. Taiwan 2016, 25, 111–123. [Google Scholar]
Harrison, P.W.; Lower, R.P.J.; Kim, N.K.D.; Young, J.P.W. Introducing the Bacterial ‘Chromid’: Not a Chromosome, Not a Plasmid. Trends Microbiol. 2010, 18, 141–148. [Google Scholar] [CrossRef]
Fritz, K.S.; Petersen, D.R. An Overview of the Chemistry and Biology of Reactive Aldehydes. Free Radic. Biol. Med. 2013, 59, 85–91. [Google Scholar] [CrossRef] [PubMed]
Islam, M.S.; Ghosh, A. Evolution, Family Expansion, and Functional Diversification of Plant Aldehyde Dehydrogenases. Gene 2022, 829, 146522. [Google Scholar] [CrossRef]
Schultz, J.; Copley, R.R.; Doerks, T.; Ponting, C.P.; Bork, P. SMART: A Web-Based Tool for the Study of Genetically Mobile Domains. Nucleic Acids Res. 2000, 28, 231–234. [Google Scholar] [CrossRef]
Langendorf, C.G.; Key, T.L.G.; Fenalti, G.; Kan, W.-T.; Buckle, A.M.; Caradoc-Davies, T.; Tuck, K.L.; Law, R.H.P.; Whisstock, J.C. The X-Ray Crystal Structure of Escherichia coli Succinic Semialdehyde Dehydrogenase; Structural Insights into NADP+/Enzyme Interactions. PLoS ONE 2010, 5, e9280. [Google Scholar] [CrossRef]
Lee, S.G.; Harline, K.; Abar, O.; Akadri, S.O.; Bastian, A.G.; Chen, H.-Y.S.; Duan, M.; Focht, C.M.; Groziak, A.R.; Kao, J.; et al. The Plant Pathogen Enzyme AldC Is a Long-Chain Aliphatic Aldehyde Dehydrogenase. J. Biol. Chem. 2020, 295, 13914–13926. [Google Scholar] [CrossRef]
Baburam, C.; Feto, N.A. Mining of Two Novel Aldehyde Dehydrogenases (DHY-SC-VUT5 and DHY-G-VUT7) from Metagenome of Hydrocarbon Contaminated Soils. BMC Biotechnol. 2021, 21, 18. [Google Scholar] [CrossRef]
Cook, S.D.; Nichols, D.S.; Smith, J.; Chourey, P.S.; McAdam, E.L.; Quittenden, L.; Ross, J.J. Auxin Biosynthesis: Are the Indole-3-Acetic Acid and Phenylacetic Acid Biosynthesis Pathways Mirror Images? Plant Physiol. 2016, 171, 1230–1241. [Google Scholar] [CrossRef]
Cook, S.D. An Historical Review of Phenylacetic Acid. Plant Cell Physiol. 2019, 60, 243–254. [Google Scholar] [CrossRef] [PubMed]
Baudoin, E.; Lerner, A.; Mirza, M.S.; El Zemrany, H.; Prigent-Combaret, C.; Jurkevich, E.; Spaepen, S.; Vanderleyden, J.; Nazaret, S.; Okon, Y.; et al. Effects of Azospirillum brasilense with Genetically Modified Auxin Biosynthesis Gene ipdC upon the Diversity of the Indigenous Microbiota of the Wheat Rhizosphere. Res. Microbiol. 2010, 161, 219–226. [Google Scholar] [CrossRef] [PubMed]
Jijón-Moreno, S.; Marcos-Jiménez, C.; Pedraza, R.O.; Ramírez-Mata, A.; de Salamone, I.G.; Fernández-Scavino, A.; Vásquez-Hernández, C.A.; Soto-Urzúa, L.; Baca, B.E. The ipdC, hisC1 and hisC2 Genes Involved in Indole-3-Acetic Production Used as Alternative Phylogenetic Markers in Azospirillum brasilense. Antonie van Leeuwenhoek 2015, 107, 1501–1517. [Google Scholar] [CrossRef]
Carreño-Lopez, R.; Campos-Reales, N.; Elmerich, C.; Baca, B.E. Physiological Evidence for Differently Regulated Tryptophan-Dependent Pathways for Indole-3-Acetic Acid Synthesis in Azospirillum brasilense. Mol. Gen. Genet. 2000, 264, 521–530. [Google Scholar] [CrossRef] [PubMed]
Malhotra, M.; Srivastava, S. Organization of the ipdC Region Regulates IAA Levels in Different Azospirillum brasilense Strains: Molecular and Functional Analysis of ipdC in Strain SM. Environ. Microbiol. 2008, 10, 1365–1373. [Google Scholar] [CrossRef] [PubMed]
Riveros-Rosas, H.; González-Segura, L.; Julián-Sánchez, A.; Díaz-Sánchez, Á.G.; Muñoz-Clares, R.A. Structural Determinants of Substrate Specificity in Aldehyde Dehydrogenases. Chem.-Biol. Interact. 2013, 202, 51–61. [Google Scholar] [CrossRef] [PubMed]
Watanabe, S.; Kodaki, T.; Makino, K. A Novel Alpha-Ketoglutaric Semialdehyde Dehydrogenase: Evolutionary Insight into an Alternative Pathway of Bacterial L-Arabinose Metabolism. J. Biol. Chem. 2006, 281, 28876–28888. [Google Scholar] [CrossRef]
Hosoya, S.; Yamane, K.; Takeuchi, M.; Sato, T. Identification and Characterization of the Bacillus subtilis d-Glucarate/Galactarate Utilization Operon ycbCDEFGHJ. FEMS Microbiol. Lett. 2002, 210, 193–199. [Google Scholar] [CrossRef]
Dobbelaere, S.; Croonenborghs, A.; Thys, A.; Vande Broek, A.; Vanderleyden, J. Phytostimulatory Effect of Azospirillum brasilense Wild Type and Mutant Strains Altered in IAA Production on Wheat. Plant Soil 1999, 212, 155–164. [Google Scholar] [CrossRef]
Xie, C.-H.; Yokota, A. Azospirillum oryzae sp. nov., a Nitrogen-Fixing Bacterium Isolated from the Roots of the Rice Plant Oryza sativa. Int. J. Syst. Evol. Microbiol. 2005, 55, 1435–1438. [Google Scholar] [CrossRef]
Luu, R.A.; Schneider, B.J.; Ho, C.C.; Nesteryuk, V.; Ngwesse, S.E.; Liu, X.; Parales, J.V.; Ditty, J.L.; Parales, R.E. Taxis of Pseudomonas putida F1 toward Phenylacetic Acid Is Mediated by the Energy Taxis Receptor Aer2. Appl. Environ. Microbiol. 2013, 79, 2416–2423. [Google Scholar] [CrossRef]

Figure 1. Correlation between the number of contigs and ALDH abundance. The numbers inside the circles give the genome size in megabase pairs (Mbp). The analysis covers 17 Azospirillum genomes retrieved from the NCBI database.

Figure 2. Length-frequency distribution of proteins containing an ALDH domain.

Figure 3. Three-level pie chart showing the distribution of ALDH families, subfamilies, and sequence identity within each subfamily.

Figure 4. Phylogenetic tree constructed with 315 ALDH protein sequences from strains of the Azospirillum genus, showing the families and clusters identified using the Conserved Domain Database (CDD) (1000 bootstrap replicates).

Figure 5. Number of genes in each ALDH family in 17 complete Azospirillum genomes. The dendrogram above the table shows the phylogenetic relationship between the strains, as determined using Anvi’o 7.1 software; green and red shading corresponds to higher and lower ALDH counts, respectively.

Figure 6. Multiple sequence alignment of the ALDH families of the Azospirillum genus, constructed using UGENE software. The ALDH consensus sequence is shown above the alignment. Sequences are denoted by their protein ID, followed by the species abbreviation and the ALDH family. This alignment was used in the superposition analysis of the ALDH models obtained by AlphaFold. The first highlighted residue is the glutamic acid (gray) that activates the catalytic cysteine (purple), a residue highly conserved in most ALDH families.

Figure 7. Structural comparison by superimposition of Azospirillum ALDH models. (A) Superimposition of the ALDH models obtained by AlphaFold, used to detect the Rossmann fold that binds the NAD(P)⁺ coenzyme (highlighted in a red square box). (B) The catalytic cysteine and glutamic acid residues, which interact with the aldehyde substrate and drive the ALDH enzymatic reaction.

Figure 8. Alluvial diagram showing the localization of Azospirillum ALDH genes on the chromosome (chr) or on other replicons, such as chromids and plasmids (p01 to p07).

Figure 9. Utility of the aldH19 gene as a phylogenetic marker. Comparison of phylogenetic trees built from the rpoD and aldH19 genes.

Table 1. Information on the Azospirillum strains used in the study.

Genus	Species	Strain	Country	Isolated from	Last Annotation Update	Reference
Azospirillum	brasilense	Sp7	Brazil	Agricultural field	2021	[37]
Azospirillum	argentinense	MTCC4035	India	Agricultural field	2021	[38]
Azospirillum	brasilense	MTCC4038	India	Agricultural field	2021	[38]
Azospirillum	brasilense	MTCC4039	India	Agricultural field	2021	[38]
Azospirillum	baldaniorum	Sp245	Brazil	Wheat root	2021	[39]
Azospirillum	argentinense	Az39	Argentina	Wheat rhizosphere	2012	[40]
Azospirillum	lipoferum	4B	Brazil	Wheat	2012	[40]
Azospirillum	brasilense	Cd	USA	Grass	2021	[41]
Azospirillum	humicireducens	SgZ-5	China	Humus from microbial fuel cell	2022	[42]
Azospirillum	oryzae	KACC 14407	Republic of Korea	Soil	2021	[43]
Azospirillum	ramasamyi	M2T2B2	Republic of Korea	Cow dung	2021	[44]
Azospirillum	sp.	B510	Japan	Rice soil	2021	[45]
Azospirillum	sp.	TSA2S	Japan	Rice soil	2019	[46]
Azospirillum	sp.	TSH100	Japan	Rice soil	2019	[46]
Azospirillum	sp.	TSH 58	Japan	Rice soil	2021	[39]
Azospirillum	thermophilum	CFH70021	China	Hot spring	2021	[47]
Azospirillum	thiophilum	BV-S	Russia	Bacterial mat from sulfide spring	2021	[48]

Table 2. Aldehyde dehydrogenase gene families in the Azospirillum genus and their probable functions.

#	CDD Cluster	ALDHs per CDD Cluster	Anvi’o COGs	ALDH Family	ALDH Family Putative Function	References
1	cd07097	52	3	αKGS-ALDH	Alpha-ketoglutarate semialdehyde dehydrogenase	[50,51]
2	cd07103	38	3	ALDH5	Succinate-5-semialdehyde dehydrogenase	[52]
3	cd07085	30	1	ALDH6	Methylmalonate semialdehyde dehydrogenase	[53,54]
4	cd07138	21	2	6-OL-ALDH	6-oxolauric aldehyde dehydrogenase	[55]
5	PRK13805	19	1	ALDH20	Alcohol dehydrogenase/aldehyde dehydrogenase	[49,56]
6	cd07108	19	1	MSR1-ALDH-LIKE	No function described yet	[57]
7	cd07114	19	5	DhaS	IAC aldehyde dehydrogenase	[58,59,60]
8	COG1012	18	6	ALDH COG1012	No specific function described yet	---
9	cd07559	17	1	ALDH2	AC/IAC aldehyde dehydrogenase	[2,60,61]
10	PRK11905	17	1	ALDH16	Proline dehydrogenase/pyrroline-5-carboxylate dehydrogenase	[62]
11	PRK00197	17	1	ALDH19	Gamma-glutamyl phosphate reductase	[63]
12	cd07130	16	1	ALDH7	α-aminoadipic semialdehyde dehydrogenase	[64,65]
13	cd07078	14	1	ALDH CD7078	No function described yet	---
14	cd07127	9	1	PaaZ	Aldehyde dehydrogenase part of oxepin alphaproteobacteria system	[66]
15	cd07105	5	1	Sal-ALDH	Salicylaldehyde dehydrogenase	[67]
16	cd07120	2	1	Furfural-ALDH	Aldehyde dehydrogenase participating in furfural conversion	[68]
17	cd07152	2	1	Benzal-ALDH	Benzaldehyde dehydrogenase	[69]
Total		315	31
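
The Total row of Table 2 can be checked by summing the per-cluster counts. The short Python sketch below does this with values transcribed from the table; the variable names are illustrative and are not part of the original analysis.

```python
# (CDD cluster, ALDHs per cluster, Anvi'o COGs), transcribed from Table 2.
rows = [
    ("cd07097", 52, 3), ("cd07103", 38, 3), ("cd07085", 30, 1),
    ("cd07138", 21, 2), ("PRK13805", 19, 1), ("cd07108", 19, 1),
    ("cd07114", 19, 5), ("COG1012", 18, 6), ("cd07559", 17, 1),
    ("PRK11905", 17, 1), ("PRK00197", 17, 1), ("cd07130", 16, 1),
    ("cd07078", 14, 1), ("cd07127", 9, 1), ("cd07105", 5, 1),
    ("cd07120", 2, 1), ("cd07152", 2, 1),
]

# Sum each numeric column across the 17 families.
total_aldh = sum(n_aldh for _, n_aldh, _ in rows)
total_cogs = sum(n_cogs for _, _, n_cogs in rows)

print(len(rows), total_aldh, total_cogs)  # prints: 17 315 31
```

The sums reproduce the 17 families, 315 ALDH sequences, and 31 Anvi’o COGs (subfamilies) reported in the table and the abstract.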

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
