Article

Comparative Analysis of Codon Usage Bias in Six Eimeria Genomes

College of Life Science and Technology, Gansu Agricultural University, Lanzhou 730070, China
*
Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2024, 25(15), 8398; https://doi.org/10.3390/ijms25158398
Submission received: 28 June 2024 / Revised: 25 July 2024 / Accepted: 30 July 2024 / Published: 1 August 2024
(This article belongs to the Section Molecular Genetics and Genomics)

Abstract

The codon usage bias (CUB) of genes encoded by different species’ genomes varies greatly. The analysis of codon usage patterns enriches our comprehension of genetic and evolutionary characteristics across diverse species. In this study, we performed a genome-wide analysis of CUB and its influencing factors in six sequenced Eimeria species that cause coccidiosis in poultry: Eimeria acervulina, Eimeria necatrix, Eimeria brunetti, Eimeria tenella, Eimeria praecox, and Eimeria maxima. The GC content of protein-coding genes varies between 52.67% and 58.24% among the six Eimeria species. The distribution trend of GC content at different codon positions follows GC1 > GC3 > GC2. Most high-frequency codons tend to end with C/G, except in E. maxima. Additionally, there is a positive correlation between GC3 content and GC3s/C3s, but a significantly negative correlation with A3s. Analysis of the ENC-Plot, neutrality plot, and PR2-bias plot suggests that selection pressure has a stronger influence than mutational pressure on CUB in the six Eimeria genomes. Finally, we identified from 11 to 15 optimal codons, with GCA, CAG, and AGC being the most commonly used optimal codons across these species. This study offers a thorough exploration of the relationships between CUB and selection pressures within the protein-coding genes of Eimeria species. Genetic evolution in these species appears to be influenced by mutations and selection pressures. Additionally, the findings shed light on unique characteristics and evolutionary traits specific to the six Eimeria species.

1. Introduction

Most amino acids can be specified by more than one codon, a phenomenon known as codon degeneracy. Codons encoding the same amino acid are termed synonymous codons. Although degeneracy allows several codons to encode a single amino acid, organisms exhibit distinct preferences for specific synonymous codons. The codon serves as the fundamental unit of information for mRNA translation, with 61 sense codons encoding 20 distinct amino acids [1]. Across genes and genomes, the selection of synonymous codons is non-random, a phenomenon known as codon usage bias (CUB) [2]. Although synonymous mutations were traditionally viewed as “silent” mutations because they do not alter protein sequences, research suggests that codon selection during evolution is not entirely neutral [3,4]. For specific species, certain synonymous codons, termed optimal codons, are favoured, while others are used less frequently. Furthermore, codon usage patterns can impact various biological processes such as mRNA synthesis, the rate of translation elongation, protein folding, and other subsequent cellular functions [5,6,7]. Specific synonymous substitutions have notable fitness and phenotypic effects across various organisms, including vertebrates and invertebrates [8].
It is widely recognised that CUB is mainly influenced by mutation pressure, natural selection, and random genetic drift [9,10]. This preference is closely linked to GC content, gene expression level, gene length, tRNA abundance, protein structure, and RNA stability [11,12,13,14]. For instance, highly expressed genes tend to favour codons that match abundant tRNAs, resulting in CUB. The development of CUB is influenced by the interplay between translational selection and mutation pressure [15,16]. Analysing codon usage patterns can offer insights into the evolutionary and adaptive processes of different species, as codon usage may vary among species or even within the same species due to different evolutionary pressures [17]. Additionally, investigating the CUB of pathogens can offer valuable insights into the regulation of pathogenic gene expression, thus contributing to the advancement of more effective vaccine strategies.
Chicken coccidiosis is a prevalent and severe parasitic disease among poultry [18]. It manifests as an acute epidemic protozoal infection caused by one or more species of coccidia [19]. This disease poses a significant threat to young chicks and is particularly prevalent among chickens aged 20–45 days [20]. Its incidence is highest during seasons characterised by temperatures of 25–30 °C and heavy rainfall [21]. The occurrence rate of coccidiosis can reach up to approximately 75%, with mortality rates ranging from 20 to 50% [22]. Affected chicks experience stunted growth and slow weight gain upon recovery. While adult chickens typically remain asymptomatic, carriers may exhibit reduced weight gain and egg production, thereby serving as important vectors for coccidiosis [23]. Chicken coccidiosis inflicts substantial economic losses annually on the global poultry industry [24,25]. Chicken coccidiosis is an intestinal parasitic infection caused by one or more species of coccidia belonging to the phylum Apicomplexa, class Sporozoa, order Eucoccidiorida, family Eimeriidae, and genus Eimeria [26]. Globally, seven species of chicken coccidia have been identified, including Eimeria acervulina, Eimeria necatrix, Eimeria brunetti, Eimeria tenella, Eimeria praecox, Eimeria maxima, and Eimeria mitis [27]. These various species exhibit differing patterns of parasitism and pathogenicity. Notably, E. necatrix and E. tenella are among the most pathogenic species, with E. necatrix predominantly parasitising the mid-portion of the small intestine and E. tenella inhabiting the ceca [28].
At present, complete genome sequences have been reported for six species of the genus Eimeria. This study aims to analyse genome-wide codon preferences in the protein-coding genes of these six Eimeria species using programming languages and bioinformatics tools. By comprehensively comparing codon usage patterns, this study seeks to inform heterologous gene expression, efforts to improve resistance to coccidiosis, and the development of vaccines against parasitic diseases. Additionally, this research aims to lay the groundwork for functional genomics and phylogenetic studies in Eimeria, contributing to our understanding of gene origin, protein expression, and gene evolution processes.

2. Results

2.1. Analysis of Nucleotide Composition and Codon Usage in Eimeria

The GC content of the protein-coding genes in the genomes of the six Eimeria species ranged from 52.67% to 58.24%, with E. maxima having the lowest value and E. necatrix the highest. The coding sequences (CDSs) in the six Eimeria species contained more G and C nucleotides than A and T nucleotides. The average GC1 (GC content at the first position of codons) exceeded both GC3 (GC content at the third position of codons) and GC2 (GC content at the second position of codons) in every species, giving the distribution trend GC1 > GC3 > GC2 (Table 1). GC1 ranged from 61.99% to 66.09%, with E. maxima having the lowest value and E. brunetti the highest. GC3 ranged from 48.71% to 59.75%, with E. maxima having the lowest value and E. tenella the highest. GC2 ranged from 45.84% to 50.40%, with E. praecox having the lowest value and E. necatrix the highest. Comparable trends were noted at the third positions of synonymous codons: GC3s (GC content at the third position of synonymous codons) ranged from 47.43% (E. maxima) to 58.80% (E. necatrix), and, with the exception of E. maxima, all species had values above 50%.
The overall RSCU (relative synonymous codon usage) values of the six Eimeria genomes were calculated (Figure 1). There are 26 codons with RSCU values greater than 1 in E. acervulina, E. necatrix, and E. tenella. Both E. brunetti and E. praecox contain 27 such codons, and E. maxima contains 28. Most of these high-frequency codons end with C/G. In E. maxima, however, high-frequency codons tend to end with A/T more often than with C/G, which is consistent with E. maxima being the only one of the six species whose GC3 is below 50%. Correspondingly, 31 codons have RSCU values less than 1 in E. maxima, 32 in E. brunetti and E. praecox, and 33 in E. acervulina, E. necatrix, and E. tenella. Except in E. maxima, most of these low-frequency codons end with A/T.

2.2. Assessing the Correlation between Codon Usage Metrics

We observed a significant positive correlation between GC3 content and GC3s across the six Eimeria species (p < 0.001). There was also a significant positive correlation between GC3 or GC3s and CBI (codon bias index) across these species. GC3 showed a significant negative correlation with A3s and a significant positive correlation with C3s (p < 0.001). Moreover, CBI exhibited a significant positive correlation with FOP (frequency of optimal codons) across all Eimeria species (p < 0.001), and the L_sym (number of synonymous codons) index showed a significant positive correlation with the L_aa (length in amino acids) index across all six species (p < 0.001). The results indicate that nucleotide composition can affect the CUB of genes in Eimeria species (Figure 2).

2.3. ENC-Plot Analysis

The average ENC values of the six Eimeria species ranged from 47.37 ± 8.70 to 51.93 ± 5.65, with E. praecox having the lowest value and E. acervulina the highest. The average ENC values were 47.67 ± 7.89 for E. brunetti, 49.30 ± 7.28 for E. necatrix, 50.25 ± 6.97 for E. tenella, and 51.17 ± 6.97 for E. maxima, suggesting generally weak codon bias across the Eimeria genomes (Table 1). Across the six species, between 1.22% (E. acervulina) and 7.32% (E. brunetti) of genes had ENC values below 35, indicating that these genes have a strong codon bias.
To assess the relationship between synonymous codon usage patterns and ENC across all genes within each Eimeria genome, an ENC-plot was constructed. Most genes in each species were located far below the expected curve, and only a small number of genes fell on it (Figure 3). This analysis suggests that selection pressure was the main factor affecting CUB in the six Eimeria species, while codon usage in only a small number of coding genes is attributable solely to mutational pressure.
We also calculated the frequency distribution of the ENC ratio in each of the six Eimeria species to quantify the discrepancy between observed ENC (ENCobs) and expected ENC (ENCexp) values. Between 70.51% and 82.20% of genes had ratios outside the range −0.05 to 0.05: the E. brunetti and E. praecox genomes had proportions greater than 80%, while the remaining four genomes had proportions between 70.51% and 76.59%. These data indicate that natural selection is the main factor affecting codon usage in most protein-coding genes of the six Eimeria species, that some genes are also affected by mutational bias, and that GC3s contributes to the formation of codon bias within these genomes.

2.4. PR2-Plot Analysis

The PR2-plot analysis was conducted to assess biases in the third codon position within four codon degenerate amino acids among protein-coding genes across six Eimeria species. According to Chargaff’s second parity rule (PR2), the quantities of A = T and C = G in a DNA strand are equivalent [29]. Each data point on the plot represents a gene, with the plot segmented into four quadrants. The centre of the plot, where both coordinates are 0.5, denotes the equilibrium point where A = T and G = C. Essentially, it signifies the absence of bias in selection or mutation forces within complementary DNA strands [30].
The results indicate that the majority of genes were distributed in the third quadrant among the six Eimeria species. The mean values of GC bias [G3/(G3 + C3)] ranged from 45.41 (E. praecox) to 47.23 (E. brunetti), and AT bias [A3/(A3 + T3)] ranged from 39.62 (E. tenella) to 49.17 (E. praecox), suggesting a pronounced preference for C over G and T over A at the third codon position (Figure 4). This implies a tendency for pyrimidine over purine usage in the third base of codons within Eimeria genomes. Therefore, the CUB of coding genes in the six Eimeria species is influenced not only by mutations but also significantly by other factors such as natural selection.

2.5. Neutrality Plot Analysis

To provide additional insight into the impact of mutational pressure and natural selection on CUB in Eimeria genomes, we performed a neutrality plot analysis with the GC12 and GC3 values of each gene. When nucleotide changes result in alterations to the encoded amino acid, it signifies the presence of selection pressure. Conversely, a correlation between GC12 and GC3 likely indicates the influence of mutational forces, as the force shaping codon bias operates across all codon positions [31]. If the slope of the regression line approaches 1, so that genes are distributed predominantly along the diagonal, CUB is influenced by mutational pressure alone. As the slope decreases towards 0, the impact of natural selection on CUB progressively strengthens. The results reveal statistically significant negative correlations between GC12 and GC3 across all six Eimeria species (p < 0.0001), with r values ranging from −0.04994 (E. tenella) to −0.5918 (E. praecox), and slope values of the regression line ranging from 0.03292 (E. tenella) to 0.3487 (E. praecox). These low slope values indicate that mutational pressure is not the predominant force: in E. praecox, the proportion of neutrality (mutation pressure) was 34.87% and the proportion of constraint on GC3 (natural selection) was 65.13%, compared with 3.292% and 96.708%, respectively, in E. tenella (Figure 5). The neutrality plot thus indicates that selection pressure exerted a greater influence than mutational pressure on CUB in the six Eimeria genomes.

2.6. Correspondence Analysis

A correspondence analysis (COA) was conducted using the RSCU values of genome-wide protein-coding genes across six Eimeria species to examine codon biases. Axis 1, Axis 2, Axis 3, and Axis 4 accounted for 13.20%, 8.03%, 4.24%, and 3.75% of the average variation rates, respectively, with the first four axes collectively contributing to an average cumulative variation of 29.23%. Axis 1 emerged as the primary factor influencing CUB. Pearson correlation analysis demonstrated a significant relationship (p < 0.05) between the coordinate value of genes on the first axis and ENC, GC3s, GC3, and GC values across the six Eimeria species. To investigate the impact of GC content on CUB within these species, genes were colour-coded based on their GC content. The findings revealed a concentration of genes with GC content exceeding 60% or falling below 45% on the left or right side of the coordinate axis, whereas genes with GC content ranging between 45% and 60% were distributed on both sides of the axis (Figure 6). This observation underscores the influence of both selection pressure and gene mutation on CUB in Eimeria genomes.

2.7. Optimal Codon Analysis of Eimeria Genomes

The comparative analysis revealed that E. acervulina, E. necatrix, E. brunetti, E. tenella, E. praecox, and E. maxima had 14, 13, 13, 15, 11, and 11 optimal codons (ΔRSCU > 0.08 and RSCU > 1), respectively. The majority of optimal codons in Eimeria species end with C or G, except in E. praecox and E. maxima. GCA, CAG, and AGC are the most widely shared, being optimal codons in all six Eimeria species. They are followed by CAC, CUG, AAC, and ACA, which are optimal in five species, and UGC, which is optimal in four. GAC, GGA, CCA, CGC, UCU, and GUG serve as optimal codons in three of the six species, and GGC, AAG, UCA, and UAC in two. GAA, UUU, AGG, CGG, and GUU are each an optimal codon in only a single Eimeria species (Figure 7, Table S2).

2.8. Comparative Analysis of Codon Usage between Eimeria and Other Organisms

Utilising the Codon Usage Database, we conducted a comparative analysis of codon usage frequencies between E. tenella and various other species to ascertain similarities in codon usage preferences. We specifically examined codons in E. tenella (ET) that displayed frequency ratios ≥2 or ≤0.5 when compared with those in Gallus gallus (GG), Toxoplasma gondii (TG), Plasmodium vivax (PV), Cryptosporidium parvum (CP), Entamoeba histolytica (EH), Mus musculus (MM), and Homo sapiens (HS). For each species pair, we identified 2, 5, 18, 35, 43, 4, and 3 such codons, respectively. A lower count of codons suggests a smaller disparity in synonymous CUB between the two species. Consequently, E. tenella displays closer alignment in codon usage preferences with Gallus gallus, Toxoplasma gondii, Mus musculus, and Homo sapiens, while notable discrepancies are observed with Plasmodium vivax, Cryptosporidium parvum, and Entamoeba histolytica (Table 2). The identical analysis performed on E. acervulina, E. necatrix, E. brunetti, E. praecox, and E. maxima produced findings analogous to those observed in E. tenella.

3. Discussion

Numerous studies have demonstrated that the CUB is influenced by a complex interplay between mutational processes and selective pressures throughout the evolutionary history of organisms [32]. Codon selection plays a crucial role in regulating gene expression, as optimal codons can enhance both the efficiency and accuracy of translation [33]. A plethora of biochemical, genetic, biophysical, and bioinformatics investigations have demonstrated that codon preference impacts various gene regulatory mechanisms, encompassing protein translation, co-translational folding, transcription, and post-transcriptional regulatory processes [34]. Moreover, codon usage profoundly influences gene expression and protein functionality across a spectrum of organisms, spanning from lower to higher organisms. In the context of expressing heterologous genes, optimising the codons of the target gene to match the preferred codons of the host species can significantly improve gene expression efficiency [35]. Several studies have highlighted the significant impact of base composition on codon preference [36]. Furthermore, the preferences for the utilisation of bases, codons, and amino acids are also influenced by factors such as gene expression levels, gene functions, and the evolutionary development of the species [37].
The current investigation focuses on elucidating CUB within the genomes of six Eimeria species. Remarkably, the genomes of Eimeria species are GC-rich, with these six species displaying a relatively elevated GC content in their genomes. Across all six genomes, the GC3 content ranges from 48.71% to 59.75%, surpassing that of GC2, with the distribution pattern being GC1 > GC3 > GC2. This investigation demonstrates an unbalanced utilisation of GC and AT bases at the third codon position in Eimeria genomes. In four-codon amino acids, G/C-ending codons appear to be preferred over A/T-ending codons. Additionally, codons ending in C are favoured over those ending in G, and codons ending in T are favoured over those ending in A, demonstrating a preference for pyrimidine bases in the third position. These observations suggest an influence of GC3 bias on codon usage patterns. The human genome exhibits a preference for G or C, particularly in synonymous codons terminating with C [38]. This preference for C or G in the third codon position is also observed in species such as Caenorhabditis elegans, Daphnia pulex, and Drosophila melanogaster [39]. Conversely, species like Borrelia burgdorferi, Mycoplasma capricolum, Onchocerca volvulus, and Plasmodium falciparum exhibit a preference for A or T in the third codon position [40,41,42]. Studies in mammals have revealed that genes with higher GC content tend to exhibit elevated expression levels compared to those with lower GC content [43], warranting further investigation into the potential correlation between Eimeria gene expression levels and GC content.
The CDSs in the six Eimeria species displayed an average ENC ranging from 47.37 ± 8.70 to 51.93 ± 5.65. If codon usage is influenced solely by GC3 content, which suggests mutational pressure, ENC values tend to lie on or close to the expected ENC curve [44]. In all six Eimeria genomes, most CDSs showed ENC values below the expected curve, and between 1.22% and 7.32% of genes exhibited an ENC below 35, suggesting a significant preference for specific codons and highlighting the dominant influence of natural selection pressure. We also explored the influence of natural selection pressure through the neutrality plot analysis. Mutational pressure is inferred as the primary determinant of CUB when the gradient of the regression line approaches 1 and the correlation between GC12 and GC3 is statistically significant. Conversely, gradients near 0, giving a nearly horizontal line, suggest that natural selection pressure predominantly shapes CUB. In the neutrality plot analysis, the slope of the regression line ranged from 0.03292 to 0.3487 in the six Eimeria species, and most genes diverged considerably from the regression line, further confirming the dominance of natural selection over mutational forces. Both the neutrality plot and PR2-plot analyses provided compelling evidence supporting the involvement of natural selection in codon bias within Eimeria. This study’s findings underscore that, despite variations in codon usage indicators among Eimeria species, the CUB of protein-coding genes in these six species is influenced by both natural selection pressures and mutational processes. Notably, all six Eimeria species experienced robust natural selection pressures on their protein-coding genes, particularly when considering the base composition.
Moreover, no correlation was observed between ENC and GRAVY or AROMO, indicating no influence of hydrophobicity or aromaticity on CUB. Similarly, no significant associations were found between CAI, GRAVY, and AROMO, suggesting minimal impact of these factors on gene expression.
This study analysed the CUB of six Eimeria species at the whole-genome level. The results indicate that there are 26–28 preferred codons (RSCU > 1) in these genomes, with 11–15 of them being optimal codons. The CUB of the six Eimeria species is similar to that of Gallus gallus, Toxoplasma gondii, Mus musculus, and Homo sapiens, but markedly different from that of Plasmodium vivax, Cryptosporidium parvum, and Entamoeba histolytica. These findings suggest a potential co-evolutionary relationship between Eimeria and host genomes, with all six Eimeria species being well adapted to Gallus gallus. Phylogenetic analysis reveals that Eimeria is more closely related to T. gondii, implying that species with closer phylogenetic relationships tend to share more similar CUB. Coccidiosis has emerged as a significant health threat in poultry, necessitating urgent efforts toward vaccine development and therapeutic discovery. Eimeria species displayed significant adaptation to sequences from both Mus musculus and Homo sapiens, as evidenced by CUB. Therefore, cell lines derived from mice and humans may provide robust support for Eimeria gene replication. Our findings provide valuable insights for selecting optimal experimental cell lines for vaccine development, heterologous gene expression studies, and research related to pathogenicity. Identifying distinct codon patterns is essential for understanding gene expression and evolutionary impacts on the genome. It also aids in phylogenetic analysis and optimising gene expression through codon optimisation. Numerous studies emphasise the association between CUB and gene expression levels, impacting translation efficiency throughout the proteome [45]. Translational selection affects both codon and amino acid usage, with highly expressed genes favouring amino acids with low or intermediate size/complexity scores, such as alanine and glycine, and disfavouring those with high scores, such as cysteine [46].
Different organisms use different codons at varying frequencies to encode the same amino acid. To enhance the expression of exogenous proteins, it is essential to optimise inserted foreign gene sequences according to the codon bias of the target organism. This optimisation primarily aims to reduce rare codon usage, thereby increasing translation speed and lowering error rates [47]. Moreover, optimised genes often feature higher guanine and cytosine nucleotide content, which enhances mRNA stability and potentially improves mRNA transport efficiency from the nucleus to the cytoplasm, thus boosting exogenous protein expression. Advances in biotechnology have facilitated the prediction of optimised gene sequences based on the target protein’s sequence, followed by the artificial synthesis of these optimised foreign genes. In vaccine development, codon optimisation has proven effective in enhancing antigen protein expression levels and is widely applied with successful outcomes [48]. Regarding adaptation to their hosts, Eimeria species typically rely on the host cell’s gene expression machinery to synthesise their own proteins. This includes utilising the host’s tRNA molecules for translation. Eimeria species have evolved mechanisms to interact with and manipulate host cell processes, allowing them to exploit host resources for their own replication and survival within the host’s intestinal cells during infection [49]. This adaptation is critical for their lifecycle and pathogenicity in poultry. Future research should investigate correlations among codon usage, amino acid frequency, and expression levels. Furthermore, our findings offer theoretical guidance for the functional genomic study of Eimeria genomes and for vaccine strategies against coccidiosis in poultry.

4. Materials and Methods

4.1. Genomic Data

The genomic data and annotations for E. acervulina, E. necatrix, E. brunetti, E. tenella, E. praecox, and E. maxima were retrieved from the NCBI genome database (https://www.ncbi.nlm.nih.gov/genome/, accessed on 1 June 2024) (Table S1). A customised Python script was utilised to filter genes based on specific criteria pertaining to CDS: sequences exceeding 300 base pairs in length, with the number of bases being a multiple of three, and containing complete start and stop codons. Subsequently, a total of 5717 coding genes for E. acervulina, 7222 for E. necatrix, 8006 for E. brunetti, 6519 for E. tenella, 7096 for E. praecox, and 5205 for E. maxima were identified and retained in the filtered sequence files, respectively.
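The filtering criteria above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the function names `passes_cds_filter` and `filter_cds` are hypothetical, and it assumes that "complete start codon" means the canonical ATG and that sequences are supplied as plain DNA strings keyed by gene name.

```python
# Hypothetical re-creation of the stated CDS filtering criteria.
START_CODONS = {"ATG"}  # assumption: only the canonical start codon is accepted
STOP_CODONS = {"TAA", "TAG", "TGA"}


def passes_cds_filter(seq: str, min_length: int = 300) -> bool:
    """Return True if a CDS meets all three criteria described in the text."""
    seq = seq.upper()
    if len(seq) <= min_length:  # must exceed 300 bp
        return False
    if len(seq) % 3 != 0:  # length must be a multiple of three
        return False
    # must begin with a start codon and end with a stop codon
    return seq[:3] in START_CODONS and seq[-3:] in STOP_CODONS


def filter_cds(records: dict) -> dict:
    """Keep only the gene -> sequence entries that pass the filter."""
    return {name: seq for name, seq in records.items() if passes_cds_filter(seq)}
```

In practice the input dictionary would be built from the downloaded CDS FASTA files, for example with a FASTA parser of your choice.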

4.2. Calculation of Codon Related Parameters

Among the 64 codons, ATG (methionine) and TGG (tryptophan) are the only codons for their amino acids, while TAA, TAG, and TGA function as stop codons. These five codons were excluded, and subsequent bioinformatics analysis focused on the remaining 59 codons. The effective number of codons (ENC) reflects the codon diversity within a gene. An ENC value of 20 indicates that only one codon is used per amino acid, while 61 indicates that all synonymous codons are used equally. The nucleotide composition of CDSs was assessed, focusing on GC-related metrics. GC denotes the proportion of guanine (G) and cytosine (C) nucleotides in each gene, while GC1, GC2, and GC3 denote the proportions of G and C nucleotides at the first, second, and third positions of the codons in the gene, respectively. GC12 denotes the average of GC1 and GC2. T3s, C3s, A3s, and G3s denote the frequencies of thymine (T), cytosine (C), adenine (A), and guanine (G), respectively, at the third position of synonymous codons within CDSs. Lastly, GC3s represents the GC content at the third position of synonymous codons. General average hydropathicity (GRAVY) values, ranging from −2 to 2, are obtained by summing the hydropathy values of the amino acids in each encoded protein sequence and dividing by the number of residues. Positive and negative GRAVY values represent hydrophobic and hydrophilic proteins, respectively. The aromaticity (AROMO) value reflects the frequency of aromatic amino acids (Phe, Tyr, and Trp). GRAVY and AROMO values serve as indicators of amino acid usage, and changes in amino acid composition can affect the results of codon usage analysis. Relative synonymous codon usage (RSCU) denotes the ratio of the observed frequency of a codon to its expected frequency, assuming equal usage of all synonymous codons for the same amino acid.
RSCU values greater than 1 signify positive codon bias, values less than 1 indicate negative bias, and values equal to 1 denote random codon usage. DAMBE 7.3.11 software [50], CodonW 1.4.2 software [51], and custom Biopython scripts were utilised for analysing the aforementioned parameters.
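To make the RSCU and position-specific GC definitions concrete, the following sketch computes both for a single sequence. It is an illustration only and does not reproduce DAMBE or CodonW: the names `count_codons`, `rscu`, and `gc_by_position` are hypothetical, the standard genetic code is assumed, and the third-position value here is GC3 over all codons rather than GC3s.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering (assumption).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codons(seq):
    """Count in-frame codons, ignoring any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def rscu(codon_counts):
    """RSCU = observed count / mean count across the synonymous family."""
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:
            continue  # skip stop codons, Met (ATG) and Trp (TGG)
        total = sum(codon_counts.get(c, 0) for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in codons:
            values[c] = codon_counts.get(c, 0) * len(codons) / total
    return values


def gc_by_position(seq):
    """Return (GC1, GC2, GC3) as fractions over all in-frame codons."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    fractions = []
    for pos in range(3):
        bases = seq[pos:usable:3]
        fractions.append(sum(b in "GC" for b in bases) / len(bases))
    return tuple(fractions)
```

For the toy sequence `GCTGCTGCCGCA` (four alanine codons), `rscu(count_codons(...))` gives 2.0 for GCT, 1.0 for GCC and GCA, and 0.0 for GCG. Under equal usage of every codon, exactly 59 codons receive an RSCU of 1, matching the 59 codons analysed in the text.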

4.3. ENC-Plot Analysis

The ENC-plot is commonly utilised to assess whether codon usage in a particular gene is influenced solely by mutation or by other factors. The ENC-plot was generated using the R programming language, with ENC values plotted on the ordinate and GC3s values on the abscissa. The expected ENC curve is calculated by the formula [52]:

$$\mathrm{ENC}_{\mathrm{exp}} = 2 + \mathrm{GC3s} + \frac{29}{\mathrm{GC3s}^{2} + (1 - \mathrm{GC3s})^{2}}$$

When the data points cluster around this expected curve, it suggests that mutation pressure independently contributes to codon bias formation. Conversely, if data points deviate significantly from the expected curve, it indicates the involvement of other factors, such as natural selection, in shaping codon bias. Furthermore, we evaluated the discrepancies between the expected and observed ENC values using the ENCratio index:

$$\mathrm{ENC}_{\mathrm{ratio}} = \frac{\mathrm{ENC}_{\mathrm{exp}} - \mathrm{ENC}_{\mathrm{obs}}}{\mathrm{ENC}_{\mathrm{exp}}}$$

The ENCratio value quantifies the extent of variation between the expected and observed ENC values.
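As a quick numerical check of the expected curve, the sketch below evaluates it for a given GC3s and derives the ratio index. The function names are hypothetical, GC3s is assumed to be expressed as a fraction between 0 and 1, and the ratio is assumed to be normalised by the expected ENC.

```python
def enc_expected(gc3s):
    """Expected ENC under GC3s-only constraint (GC3s given as a fraction)."""
    return 2 + gc3s + 29 / (gc3s ** 2 + (1 - gc3s) ** 2)


def enc_ratio(enc_obs, gc3s):
    """Relative gap between expected and observed ENC.

    Assumption: the denominator is the expected ENC.
    """
    expected = enc_expected(gc3s)
    return (expected - enc_obs) / expected


def fraction_outside(ratios, limit=0.05):
    """Share of genes whose ratio lies outside [-limit, +limit]."""
    return sum(abs(r) > limit for r in ratios) / len(ratios)
```

At GC3s = 0.5 the curve peaks at 2 + 0.5 + 29/0.5 = 60.5, so a gene with an observed ENC of 48.4 at that GC3s has a ratio of 0.2 and would be counted among the genes lying well below the curve.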

4.4. PR2-Plot Analysis

The PR2-plot was generated using the R programming language to analyse the proportional relationship between purines and pyrimidines at the third base of each four-codon degenerate amino acid. The four-codon degenerate amino acids (or four-codon families) are alanine (GCT, GCG, GCC, GCA), glycine (GGT, GGG, GGC, GGA), proline (CCA, CCC, CCT, CCG), threonine (ACC, ACA, ACG, ACT), valine (GTT, GTG, GTC, GTA), leucine (CTA, CTC, CTG, CTT), serine (TCA, TCC, TCG, TCT), and arginine (CGA, CGC, CGG, CGT). We employed A3/(A3 + T3) as the vertical axis and G3/(G3 + C3) as the horizontal axis, where A3, T3, G3, and C3 denote the content of A, T, G, and C at the third codon position, respectively. These values for the four-codon degenerate amino acids were computed using a custom Biopython script. The vector extending from the centre point to each gene's point indicates the direction and strength of purine or pyrimidine bias at the third base of codons.
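The two PR2 coordinates can be computed per gene as in the sketch below. This is an assumption-laden illustration rather than the authors' Biopython script: `pr2_coordinates` is a hypothetical name, and four-codon families are identified by their first two bases.

```python
# First two bases of the eight four-codon families listed in the text.
FOURFOLD_PREFIXES = ("GC", "GG", "CC", "AC", "GT", "CT", "TC", "CG")


def pr2_coordinates(seq):
    """Return (G3/(G3+C3), A3/(A3+T3)) over four-codon families, or None."""
    seq = seq.upper()
    third = {"A": 0, "T": 0, "G": 0, "C": 0}
    usable = len(seq) - len(seq) % 3
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        # count the third base only for codons in four-codon families
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in third:
            third[codon[2]] += 1
    at = third["A"] + third["T"]
    gc = third["G"] + third["C"]
    if at == 0 or gc == 0:
        return None  # coordinates undefined for this gene
    return third["G"] / gc, third["A"] / at
```

A gene at (0.5, 0.5) shows no bias; values below 0.5 on the horizontal axis indicate C over G, and values below 0.5 on the vertical axis indicate T over A.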

4.5. Neutrality Plot Analysis

The neutrality plot is primarily employed to examine the relationship between GC12 and GC3. In this study, the neutrality plot was constructed using the R programming language to assess the relative contributions of mutational pressure and natural selection to CUB in the genes of the six Eimeria species. GC3 values were plotted on the abscissa, while GC12 values were plotted on the ordinate, and a regression line of GC12 on GC3 was fitted to the data. The slope of the regression line, together with the significance of the correlation, indicates the contribution of mutational pressure: a slope close to 1 suggests that mutation pressure dominates, whereas a slope close to 0 suggests that natural selection dominates [53].
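The quantities behind the neutrality plot can be sketched as follows. This is an illustration in plain Python, not the authors' R code: the function names and the example values are hypothetical, and the sketch counts every codon of the sequence, whereas an actual analysis may exclude particular codons such as stop codons.

```python
def gc_by_position(cds):
    """Return (GC12, GC3) as fractions for one coding sequence."""
    cds = cds.upper()
    n = len(cds) // 3
    if n == 0:
        raise ValueError("sequence shorter than one codon")
    gc = [0, 0, 0]
    for i in range(n):
        for pos in range(3):
            if cds[3 * i + pos] in "GC":
                gc[pos] += 1
    # GC12 is the mean GC content of the first and second codon positions
    return (gc[0] + gc[1]) / (2 * n), gc[2] / n


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys on xs; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical per-gene values: GC3 on the abscissa, GC12 on the ordinate
gc3_values = [0.40, 0.50, 0.60]
gc12_values = [0.52, 0.55, 0.58]
slope, intercept = fit_line(gc3_values, gc12_values)
print(slope, intercept)
```

In this toy data set GC12 changes by only 0.3 for every unit change in GC3, a pattern that is usually read as a limited contribution of mutation pressure.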

4.6. Analysis of Optimal Codons in Eimeria Genomes

The ENC values of the CDS from each of the six Eimeria species were computed. The CDS of each species were then sorted by ENC value, and the 10% of sequences with the lowest ENC values and the 10% with the highest ENC values were taken as the high- and low-expression libraries, respectively. Following this, the RSCU values of each library were calculated, together with ΔRSCU, the difference between the RSCU of a codon in the high-expression library and its RSCU in the low-expression library. Codons with RSCU values exceeding 1 were considered high-frequency codons, whereas those with ΔRSCU values exceeding 0.08 were deemed high-expression codons. Codons meeting both criteria were designated as optimal codons.
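The selection procedure can be sketched as below. This is not the authors' implementation: the function names, gene identifiers, and RSCU values are hypothetical, ΔRSCU is assumed to be the RSCU in the high-expression library minus that in the low-expression library, and the RSCU > 1 criterion is assumed to refer to the RSCU computed over all genes.

```python
def split_by_enc(enc_by_gene, fraction=0.10):
    """Return (high_expression_ids, low_expression_ids).

    Genes with the lowest ENC form the high-expression library and genes
    with the highest ENC form the low-expression library.
    """
    ranked = sorted(enc_by_gene, key=enc_by_gene.get)
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k], ranked[-k:]


def optimal_codons(rscu_high, rscu_low, rscu_all, delta_cutoff=0.08):
    """Codons with overall RSCU > 1 and delta RSCU above the cutoff."""
    selected = []
    for codon, overall in rscu_all.items():
        delta = rscu_high.get(codon, 0.0) - rscu_low.get(codon, 0.0)
        if overall > 1 and delta > delta_cutoff:
            selected.append(codon)
    return sorted(selected)


# Hypothetical ENC values for 20 genes
enc_values = {f"gene{i}": 30 + i for i in range(20)}
high_ids, low_ids = split_by_enc(enc_values)

# Hypothetical RSCU values for three alanine codons
rscu_all = {"GCA": 1.6, "GCC": 0.9, "GCG": 1.2}
rscu_high = {"GCA": 1.9, "GCC": 1.0, "GCG": 1.20}
rscu_low = {"GCA": 1.3, "GCC": 0.7, "GCG": 1.18}
best = optimal_codons(rscu_high, rscu_low, rscu_all)
print(high_ids, low_ids, best)
```

Here only GCA qualifies: GCC fails the RSCU > 1 criterion and GCG fails the ΔRSCU > 0.08 criterion.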

4.7. Comparative Analysis of Codon Usage between Eimeria and Other Organisms

The codon usage of the Eimeria tenella genome was compared with that of Gallus gallus, Toxoplasma gondii, Plasmodium vivax, Cryptosporidium parvum, Entamoeba histolytica, Mus musculus, and Homo sapiens. Codon usage tables for various species were obtained from the Codon Usage Database (http://www.kazusa.or.jp/codon/, accessed on 1 June 2024) [54]. A frequency ratio between 0.50 and 2.00 for synonymous codon usage in different species suggests a tendency for both species to employ the same synonymous codon. Conversely, if the ratio falls outside the 0.50–2.00 range, it indicates a preference for a specific synonymous codon in one or both of the compared species.
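The ratio criterion can be illustrated with a short sketch. The function name is hypothetical, and the frequencies are the E. tenella and Entamoeba histolytica values for three codons taken from Table 2; this is an illustration of the 0.50 to 2.00 rule, not the authors' code.

```python
def compare_codon_usage(freq_a, freq_b, low=0.5, high=2.0):
    """Return {codon: (ratio, similar)} for codons present in both tables.

    similar is True when the ratio freq_a / freq_b lies within [low, high].
    """
    result = {}
    for codon, value in freq_a.items():
        other = freq_b.get(codon)
        if not other:  # skip codons that are missing or have zero frequency
            continue
        ratio = value / other
        result[codon] = (round(ratio, 2), low <= ratio <= high)
    return result


# Codon frequencies from Table 2 (E. tenella versus Entamoeba histolytica)
e_tenella = {"UUU": 16.3, "CUG": 41.4, "CGC": 14.7}
e_histolytica = {"UUU": 31.3, "CUG": 0.6, "CGC": 0.1}
comparison = compare_codon_usage(e_tenella, e_histolytica)
print(comparison)
```

UUU falls inside the 0.50 to 2.00 range, whereas CUG and CGC are used far more often in E. tenella than in Entamoeba histolytica.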

4.8. Statistical Analysis

The data are expressed as means ± standard deviation (SD), and statistical analyses were conducted using GraphPad Prism 8. Significance levels were set at * p < 0.05 and ** p < 0.001. Group mean differences were assessed through either one-way ANOVA or Student’s t-test.

Supplementary Materials

The supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms25158398/s1.

Author Contributions

Conceptualization, Y.Z.; methodology, Y.Z.; software, Y.Z.; validation, Y.Z. and S.Z.; formal analysis, Y.Z. and S.Z.; investigation, Y.Z. and S.Z.; resources, Y.Z. and S.Z.; data curation, Y.Z. and S.Z.; writing—original draft preparation, Y.Z.; writing—review and editing, Y.Z.; visualization, S.Z.; supervision, Y.Z.; project administration, Y.Z.; funding acquisition, Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the Scientific and Technological Innovation Foundation of Gansu Agricultural University-PhD research startup foundation (GAU-KYQD-2019-26).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article and Supplementary Materials.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Sharp, P.M.; Li, W.-H. The codon adaptation index-a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 1987, 15, 1281–1295.
  2. Plotkin, J.B.; Kudla, G. Synonymous but not the same: The causes and consequences of codon bias. Nat. Rev. Genet. 2011, 12, 32–42.
  3. Yannai, A.; Katz, S.; Hershberg, R. The codon usage of lowly expressed genes is subject to natural selection. Genome Biol. Evol. 2018, 10, 1237–1246.
  4. Machado, H.E.; Lawrie, D.S.; Petrov, D.A. Pervasive strong selection at the level of codon usage bias in Drosophila melanogaster. Genetics 2020, 214, 511–528.
  5. Zalucki, Y.M.; Power, P.M.; Jennings, M.P. Selection for efficient translation initiation biases codon usage at second amino acid position in secretory proteins. Nucleic Acids Res. 2007, 35, 5748–5754.
  6. Zalucki, Y.M.; Beacham, I.R.; Jennings, M.P. Biased codon usage in signal peptides: A role in protein export. Trends Microbiol. 2009, 17, 146–150.
  7. Guan, D.-L.; Ma, L.-B.; Khan, M.S.; Zhang, X.-X.; Xu, S.-Q.; Xie, J.-Y. Analysis of codon usage patterns in Hirudinaria manillensis reveals a preference for GC-ending codons caused by dominant selection constraints. BMC Genom. 2018, 19, 542.
  8. Iriarte, A.; Lamolle, G.; Musto, H. Codon usage bias: An endless tale. J. Mol. Evol. 2021, 89, 589–593.
  9. Chen, S.L.; Lee, W.; Hottes, A.K.; Shapiro, L.; McAdams, H.H. Codon usage between genomes is constrained by genome-wide mutational processes. Proc. Natl. Acad. Sci. USA 2004, 101, 3480–3485.
  10. Hanson, G.; Coller, J. Codon optimality, bias and usage in translation and mRNA decay. Nat. Rev. Mol. Cell Biol. 2018, 19, 20–30.
  11. Frumkin, I.; Lajoie, M.J.; Gregg, C.J.; Hornung, G.; Church, G.M.; Pilpel, Y. Codon usage of highly expressed genes affects proteome-wide translation efficiency. Proc. Natl. Acad. Sci. USA 2018, 115, E4940–E4949.
  12. Parvathy, S.T.; Udayasuriyan, V.; Bhadana, V. Codon usage bias. Mol. Biol. Rep. 2022, 49, 539–565.
  13. Shen, X.; Song, S.; Li, C.; Zhang, J. Synonymous mutations in representative yeast genes are mostly strongly non-neutral. Nature 2022, 606, 725–731.
  14. Weissman, J.L.; Hou, S.; Fuhrman, J.A. Estimating maximal microbial growth rates from cultures, metagenomes, and single cells via codon usage patterns. Proc. Natl. Acad. Sci. USA 2021, 118, e2016810118.
  15. Presnyak, V.; Alhusaini, N.; Chen, Y.-H.; Martin, S.; Morris, N.; Kline, N.; Olson, S.; Weinberg, D.; Baker, K.E.; Graveley, B.R. Codon optimality is a major determinant of mRNA stability. Cell 2015, 160, 1111–1124.
  16. Torrent, M.; Chalancon, G.; De Groot, N.S.; Wuster, A.; Madan Babu, M. Cells alter their tRNA abundance to selectively regulate protein synthesis during stress conditions. Sci. Signal. 2018, 11, eaat6409.
  17. Mittal, P.; Brindle, J.; Stephen, J.; Plotkin, J.B.; Kudla, G. Codon usage influences fitness through RNA toxicity. Proc. Natl. Acad. Sci. USA 2018, 115, 8639–8644.
  18. Fatoba, A.J.; Adeleke, M.A. Diagnosis and control of chicken coccidiosis: A recent update. J. Parasit. Dis. 2018, 42, 483–493.
  19. Mesa-Pineda, C.; Navarro-Ruíz, J.L.; López-Osorio, S.; Chaparro-Gutiérrez, J.J.; Gómez-Osorio, L.M. Chicken coccidiosis: From the parasite lifecycle to control of the disease. Front. Vet. Sci. 2021, 8, 787653.
  20. Williams, R. Tracing the emergence of drug-resistance in coccidia (Eimeria spp.) of commercial broiler flocks medicated with decoquinate for the first time in the United Kingdom. Vet. Parasitol. 2006, 135, 1–14.
  21. Attree, E.; Sanchez-Arsuaga, G.; Jones, M.; Xia, D.; Marugan-Hernandez, V.; Blake, D.; Tomley, F. Controlling the causative agents of coccidiosis in domestic chickens; an eye on the past and considerations for the future. CABI Agric. Biosci. 2021, 2, 37.
  22. Dalloul, R.A.; Lillehoj, H.S. Poultry coccidiosis: Recent advancements in control measures and vaccine development. Expert Rev. Vaccines 2006, 5, 143–163.
  23. Ahmad, R.; Yu, Y.-H.; Hua, K.-F.; Chen, W.-J.; Zaborski, D.; Dybus, A.; Hsiao, F.S.-H.; Cheng, Y.-H. Management and control of coccidiosis in poultry—A review. Anim. Biosci. 2024, 37, 1.
  24. Fornace, K.M.; Clark, E.L.; Macdonald, S.E.; Namangala, B.; Karimuribo, E.; Awuni, J.A.; Thieme, O.; Blake, D.P.; Rushton, J. Occurrence of Eimeria species parasites on small-scale commercial chicken farms in Africa and indication of economic profitability. PLoS ONE 2013, 8, e84254.
  25. Shirley, M.W.; Smith, A.L.; Tomley, F.M. The biology of avian Eimeria with an emphasis on their control by vaccination. Adv. Parasitol. 2005, 60, 285–330.
  26. Fayer, R. Epidemiology of protozoan infections: The coccidia. Vet. Parasitol. 1980, 6, 75–103.
  27. Lillehoj, E.P.; Yun, C.H.; Lillehoj, H.S. Vaccines against the avian enteropathogens Eimeria, Cryptosporidium and Salmonella. Anim. Health Res. Rev. 2000, 1, 47–65.
  28. McDonald, V.; Shirley, M. The endogenous development of virulent strains and attenuated precocious lines of Eimeria tenella and E. necatrix. J. Parasitol. 1987, 73, 993–997.
  29. Sueoka, N. Translation-coupled violation of Parity Rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position. Gene 1999, 238, 53–58.
  30. Sueoka, N. Intrastrand parity rules of DNA base composition and usage biases of synonymous codons. J. Mol. Evol. 1995, 40, 318–325.
  31. Khandia, R.; Singhal, S.; Kumar, U.; Ansari, A.; Tiwari, R.; Dhama, K.; Das, J.; Munjal, A.; Singh, R.K. Analysis of Nipah virus codon usage and adaptation to hosts. Front. Microbiol. 2019, 10, 886.
  32. Duret, L. Evolution of synonymous codon usage in metazoans. Curr. Opin. Genet. Dev. 2002, 12, 640–649.
  33. Bazzini, A.A.; Del Viso, F.; Moreno-Mateos, M.A.; Johnstone, T.G.; Vejnar, C.E.; Qin, Y.; Yao, J.; Khokha, M.K.; Giraldez, A.J. Codon identity regulates mRNA stability and translation efficiency during the maternal-to-zygotic transition. EMBO J. 2016, 35, 2087–2103.
  34. Zhou, Z.; Dang, Y.; Zhou, M.; Li, L.; Yu, C.-h.; Fu, J.; Chen, S.; Liu, Y. Codon usage is an important determinant of gene expression levels largely through its effects on transcription. Proc. Natl. Acad. Sci. USA 2016, 113, E6117–E6125.
  35. Zhao, Y.; Huang, G.; Zhang, W. Mutations in NlInR1 affect normal growth and lifespan in the brown planthopper Nilaparvata lugens. Insect Biochem. Mol. Biol. 2019, 115, 103246.
  36. De la Fuente, R.; Díaz-Villanueva, W.; Arnau, V.; Moya, A. Genomic signature in evolutionary biology: A review. Biology 2023, 12, 322.
  37. Khandia, R.; Gurjar, P.; Kamal, M.A.; Greig, N.H. Relative synonymous codon usage and codon pair analysis of depression associated genes. Sci. Rep. 2024, 14, 3502.
  38. Dhindsa, R.S.; Copeland, B.R.; Mustoe, A.M.; Goldstein, D.B. Natural selection shapes codon usage in the human genome. Am. J. Hum. Genet. 2020, 107, 83–95.
  39. Liu, Y.; Yang, Q.; Zhao, F. Synonymous but not silent: The codon usage code for gene expression and protein folding. Annu. Rev. Biochem. 2021, 90, 375–401.
  40. Muto, A.; Yamao, F.; Osawa, S. The genome of Mycoplasma capricolum. Prog. Nucleic Acid Res. Mol. Biol. 1987, 34, 29–58.
  41. Saul, A.; Battistutta, D. Codon usage in Plasmodium falciparum. Mol. Biochem. Parasitol. 1988, 27, 35–42.
  42. Milhon, J.L.; Tracy, J.W. Updated codon usage in Schistosoma. Exp. Parasitol. 1995, 80, 353–356.
  43. Radrizzani, S.; Kudla, G.; Izsvák, Z.; Hurst, L.D. Selection on synonymous sites: The unwanted transcript hypothesis. Nat. Rev. Genet. 2024, 25, 431–448.
  44. He, B.; Dong, H.; Jiang, C.; Cao, F.; Tao, S.; Xu, L.-a. Analysis of codon usage patterns in Ginkgo biloba reveals codon usage tendency from A/U-ending to G/C-ending. Sci. Rep. 2016, 6, 35927.
  45. Williford, A.; Demuth, J.P. Gene expression levels are correlated with synonymous codon usage, amino acid composition, and gene architecture in the red flour beetle, Tribolium castaneum. Mol. Biol. Evol. 2012, 29, 3755–3766.
  46. Whittle, C.A.; Extavour, C.G. Expression-linked patterns of codon usage, amino acid frequency, and protein length in the basally branching arthropod Parasteatoda tepidariorum. Genome Biol. Evol. 2016, 8, 2722–2736.
  47. Gustafsson, C.; Govindarajan, S.; Minshull, J. Codon bias and heterologous protein expression. Trends Biotechnol. 2004, 22, 346–353.
  48. Elena, C.; Ravasi, P.; Castelli, M.E.; Peirú, S.; Menzella, H.G. Expression of codon optimized genes in microbial systems: Current industrial applications and perspectives. Front. Microbiol. 2014, 5, 21.
  49. Xu, L.; Li, X. Conserved proteins of Eimeria and their applications to develop universal subunit vaccine against chicken coccidiosis. Vet. Vaccine 2024, 3, 100068.
  50. Xia, X.; Xie, Z. DAMBE: Software package for data analysis in molecular biology and evolution. J. Hered. 2001, 92, 371–373.
  51. Peden, J.F. Analysis of Codon Usage. Ph.D. Thesis, University of Nottingham, Nottingham, UK, 2000.
  52. Wright, F. The ‘effective number of codons’ used in a gene. Gene 1990, 87, 23–29.
  53. Nasrullah, I.; Butt, A.M.; Tahir, S.; Idrees, M.; Tong, Y. Genomic analysis of codon usage shows influence of mutation pressure, natural selection, and host features on Marburg virus evolution. BMC Evol. Biol. 2015, 15, 174.
  54. Nakamura, Y.; Gojobori, T.; Ikemura, T. Codon usage tabulated from international DNA sequence databases: Status for the year 2000. Nucleic Acids Res. 2000, 28, 292.
Figure 1. An examination of RSCU was conducted on protein-coding genes derived from six Eimeria species, specifically, E. acervulina; E. necatrix; E. brunetti; E. tenella; E. praecox and E. maxima.
Figure 2. Correlation analysis among different indices across six Eimeria species. Dark blue indicates a positive correlation, while dark red indicates a negative correlation. A larger absolute value indicates a stronger correlation. A single asterisk (*) denotes a statistically significant correlation between two indices at p < 0.05, and double asterisks (**) denote a significant correlation at p < 0.001. The six Eimeria species, listed from left to right and from top to bottom, are E. acervulina, E. necatrix, E. brunetti, E. tenella, E. praecox, and E. maxima. T3s, C3s, A3s, GC3s: base compositions at the third position of synonymous codons. CAI: codon adaptation index. CBI: codon bias index. Fop: frequency of optimal codons. Nc: effective number of codons. L_sym: number of synonymous codons. L_aa: length in amino acids. Gravy: grand average of hydropathicity. Aromo: aromaticity.
Figure 3. The ENC-plot analysis was conducted on protein-coding genes across six Eimeria species. The red solid line in the plot represents the expected curve under the assumption that codon usage bias is solely influenced by mutation pressure. The species analysed include E. acervulina; E. necatrix; E. brunetti; E. tenella; E. praecox and E. maxima.
Figure 4. The PR2-plot analysis was performed on protein-coding genes across six Eimeria species. The species examined include E. acervulina; E. necatrix; E. brunetti; E. tenella; E. praecox and E. maxima.
Figure 5. The neutrality plot analysis was conducted on GC12 and GC3 for the protein-coding genes across six Eimeria species. The species analysed are as follows: E. acervulina; E. necatrix; E. brunetti; E. tenella; E. praecox and E. maxima.
Figure 6. Correspondence analysis (COA) utilising the relative synonymous codon usage (RSCU) values obtained from protein-coding genes within six Eimeria species. In the graphical representation, red denotes genes with GC content falling below 45%, green represents genes with GC content ranging between 45% and 60%, and blue indicates genes with GC content exceeding 60%. The species included are labelled as follows: E. acervulina; E. necatrix; E. brunetti; E. tenella; E. praecox and E. maxima.
Figure 7. Optimal codon usage in six Eimeria species. Codons meeting the criteria of ΔRSCU > 0.08 and RSCU > 1 are marked with a single asterisk. Codons exhibiting high ΔRSCU are highlighted in yellow, while codons with low ΔRSCU are shown in purple.
Table 1. Average GC content and ENC values of protein-coding genes in six Eimeria species.
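| Species | GC (%) | GC1 (%) | GC2 (%) | GC3 (%) | GC3s (%) | ENC |
|---|---|---|---|---|---|---|
| E. acervulina | 54.77 | 62.71 | 48.83 | 52.78 | 51.64 | 51.92 |
| E. necatrix | 58.24 | 64.61 | 50.40 | 59.72 | 58.80 | 49.30 |
| E. brunetti | 57.12 | 66.09 | 48.45 | 56.83 | 55.93 | 47.67 |
| E. tenella | 57.59 | 63.64 | 49.38 | 59.75 | 58.78 | 50.25 |
| E. praecox | 54.59 | 65.60 | 45.84 | 52.33 | 51.37 | 47.37 |
| E. maxima | 52.67 | 61.99 | 47.30 | 48.71 | 47.43 | 51.17 |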
GC: GC content of all codons; GC1: GC content at the first position of codons; GC2: GC content at the second position of codons; GC3: GC content at the third position of codons; GC3s: GC content at the third position of synonymous codons. ENC: effective number of codons.
Table 2. Comparison of synonymous codon usage between E. tenella and other species.
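Columns ET to HS give codon frequencies; columns ET/GG to ET/HS give the corresponding frequency ratios.

| Amino Acid | Codon | ET | GG | TG | PV | CP | EH | MM | HS | ET/GG | ET/TG | ET/PV | ET/CP | ET/EH | ET/MM | ET/HS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Phe | UUU | 16.3 | 16.8 | 13.3 | 22.6 | 34.6 | 31.3 | 17.2 | 17.6 | 0.97 | 1.23 | 0.72 | 0.47 | 0.52 | 0.95 | 0.93 |
| | UUC | 17.3 | 20.2 | 25 | 17 | 12.1 | 11.3 | 21.8 | 20.3 | 0.86 | 0.69 | 1.02 | 1.43 | 1.53 | 0.79 | 0.85 |
| Leu | UUA | 6.2 | 7 | 2.6 | 13.7 | 33.7 | 43.6 | 6.7 | 7.7 | 0.89 | 2.38 | 0.45 | 0.18 | 0.14 | 0.93 | 0.81 |
| | UUG | 18.7 | 12.6 | 14.5 | 18.9 | 17.3 | 6.3 | 13.4 | 12.9 | 1.48 | 1.29 | 0.99 | 1.08 | 2.97 | 1.4 | 1.45 |
| | CUU | 16.4 | 12.4 | 15 | 9 | 20.3 | 26.9 | 13.4 | 13.2 | 1.32 | 1.09 | 1.82 | 0.81 | 0.61 | 1.22 | 1.24 |
| | CUC | 17.8 | 16.8 | 27.1 | 14 | 5.2 | 2.1 | 20.2 | 19.6 | 1.06 | 0.66 | 1.27 | 3.42 | 8.48 | 0.88 | 0.91 |
| | CUA | 8.2 | 6 | 3.8 | 10 | 10.4 | 2.7 | 8.1 | 7.2 | 1.37 | 2.16 | 0.82 | 0.79 | 3.04 | 1.01 | 1.14 |
| | CUG | 41.4 | 38.5 | 24 | 17.4 | 3.7 | 0.6 | 39.5 | 39.6 | 1.08 | 1.73 | 2.38 | 11.19 | 69 | 1.05 | 1.05 |
| Ile | AUU | 14.7 | 16.8 | 12.2 | 22.7 | 49.3 | 59.4 | 15.4 | 16 | 0.88 | 1.2 | 0.65 | 0.30 | 0.25 | 0.95 | 0.92 |
| | AUC | 11.4 | 22 | 17.4 | 16.9 | 10.6 | 5.9 | 22.5 | 20.8 | 0.52 | 0.66 | 0.67 | 1.08 | 1.93 | 0.51 | 0.55 |
| | AUA | 7.9 | 8.8 | 2.9 | 16.5 | 24.4 | 14.6 | 7.4 | 7.5 | 0.90 | 2.72 | 0.48 | 0.32 | 0.54 | 1.07 | 1.05 |
| Val | GUU | 15.3 | 13.1 | 14.7 | 12.8 | 24.4 | 42 | 10.7 | 11 | 1.17 | 1.04 | 1.2 | 0.63 | 0.36 | 1.43 | 1.39 |
| | GUC | 14.1 | 13.6 | 25.4 | 10 | 4.8 | 4.3 | 15.4 | 14.5 | 1.04 | 0.56 | 1.41 | 2.94 | 3.28 | 0.92 | 0.97 |
| | GUA | 7.5 | 7.8 | 5.7 | 15 | 15.8 | 16.9 | 7.4 | 7.1 | 0.96 | 1.32 | 0.5 | 0.47 | 0.44 | 1.01 | 1.06 |
| | GUG | 19.9 | 28.2 | 21.4 | 19.7 | 4.9 | 2.6 | 28.4 | 28.1 | 0.71 | 0.93 | 1.01 | 4.06 | 7.65 | 0.7 | 0.71 |
| Ser | UCU | 14.8 | 14.1 | 20.3 | 9.1 | 28.3 | 17.1 | 16.2 | 15.2 | 1.05 | 0.73 | 1.63 | 0.52 | 0.87 | 0.91 | 0.97 |
| | UCC | 12 | 15.7 | 15.2 | 13.7 | 7.9 | 1.5 | 18.1 | 17.7 | 0.76 | 0.79 | 0.88 | 1.52 | 8 | 0.66 | 0.68 |
| | UCA | 11 | 11.6 | 8.1 | 9.2 | 31.4 | 27.8 | 11.8 | 12.2 | 0.95 | 1.36 | 1.2 | 0.35 | 0.4 | 0.93 | 0.9 |
| | UCG | 9.8 | 5.2 | 19 | 8.2 | 4.3 | 0.7 | 4.2 | 4.4 | 1.88 | 0.52 | 1.2 | 2.28 | 14 | 2.33 | 2.23 |
| | AGU | 8.7 | 11.2 | 9.3 | 12.7 | 18.4 | 17 | 12.7 | 12.1 | 0.78 | 0.94 | 0.69 | 0.47 | 0.51 | 0.69 | 0.72 |
| | AGC | 28.6 | 20.2 | 16.1 | 17.8 | 8.6 | 1.6 | 19.7 | 19.5 | 1.42 | 1.78 | 1.61 | 3.33 | 17.88 | 1.45 | 1.47 |
| Pro | CCU | 14.2 | 15.3 | 15.4 | 5.7 | 12.9 | 8.8 | 18.4 | 17.5 | 0.93 | 0.92 | 2.49 | 1.10 | 1.61 | 0.77 | 0.81 |
| | CCC | 14.4 | 17 | 13.6 | 9.4 | 2.8 | 0.7 | 18.2 | 19.8 | 0.85 | 1.06 | 1.53 | 5.14 | 20.57 | 0.79 | 0.73 |
| | CCA | 13.1 | 15.7 | 10.4 | 13.4 | 20.8 | 27.4 | 17.3 | 16.9 | 0.83 | 1.26 | 0.98 | 0.63 | 0.48 | 0.76 | 0.78 |
| | CCG | 11.1 | 7.8 | 17.3 | 4.1 | 1.9 | 0.2 | 6.2 | 6.9 | 1.42 | 0.64 | 2.71 | 5.84 | 55.5 | 1.79 | 1.61 |
| Thr | ACU | 13.6 | 13.3 | 12.3 | 11.8 | 21.6 | 25.5 | 13.7 | 13.1 | 1.02 | 1.11 | 1.15 | 0.63 | 0.53 | 0.99 | 1.04 |
| | ACC | 10.2 | 16.5 | 13 | 15.4 | 5.5 | 2.9 | 19 | 18.9 | 0.62 | 0.78 | 0.66 | 1.85 | 3.52 | 0.54 | 0.54 |
| | ACA | 15.6 | 16.1 | 13.2 | 14.3 | 20.4 | 27.5 | 16 | 15.1 | 0.97 | 1.18 | 1.09 | 0.76 | 0.57 | 0.98 | 1.03 |
| | ACG | 11.5 | 7.7 | 16.4 | 11.1 | 2.6 | 1.1 | 5.6 | 6.1 | 1.49 | 0.7 | 1.04 | 4.42 | 10.45 | 2.05 | 1.89 |
| Ala | GCU | 34.6 | 20.8 | 20.5 | 11.7 | 17 | 26.9 | 20 | 18.4 | 1.66 | 1.69 | 2.96 | 2.04 | 1.29 | 1.73 | 1.88 |
| | GCC | 20.5 | 22.9 | 22.6 | 16.3 | 3.9 | 2.7 | 26 | 27.7 | 0.90 | 0.91 | 1.26 | 5.26 | 7.59 | 0.79 | 0.74 |
| | GCA | 42.5 | 19 | 20.8 | 26.6 | 17.5 | 27.5 | 15.8 | 15.8 | 2.24 | 2.04 | 1.6 | 2.43 | 1.55 | 2.69 | 2.69 |
| | GCG | 19.9 | 9.1 | 31.5 | 13.3 | 1.9 | 0.5 | 6.4 | 7.4 | 2.19 | 0.63 | 1.5 | 10.47 | 39.8 | 3.11 | 2.69 |
| Tyr | UAU | 7.5 | 11.8 | 5.1 | 15.1 | 25.9 | 32.1 | 12.2 | 12.2 | 0.64 | 1.47 | 0.5 | 0.29 | 0.23 | 0.61 | 0.61 |
| | UAC | 12.3 | 17.8 | 14.5 | 26.8 | 8.9 | 3.8 | 16.1 | 15.3 | 0.69 | 0.85 | 0.46 | 1.38 | 3.24 | 0.76 | 0.8 |
| His | CAU | 8.6 | 9.5 | 7.9 | 6.9 | 14.1 | 15.8 | 10.6 | 10.9 | 0.91 | 1.09 | 1.25 | 0.61 | 0.54 | 0.81 | 0.79 |
| | CAC | 13.1 | 14.4 | 13.4 | 10.2 | 4.6 | 2.1 | 15.3 | 15.1 | 0.91 | 0.98 | 1.28 | 2.85 | 6.24 | 0.86 | 0.87 |
| Gln | CAA | 18.5 | 12.1 | 13.7 | 20.6 | 27.5 | 37 | 12 | 12.3 | 1.53 | 1.35 | 0.9 | 0.67 | 0.5 | 1.54 | 1.5 |
| | CAG | 44.7 | 32.6 | 24.5 | 14.8 | 9.1 | 1.7 | 34.1 | 34.2 | 1.37 | 1.82 | 3.02 | 4.91 | 26.29 | 1.31 | 1.31 |
| Asn | AAU | 10.2 | 16.9 | 10.1 | 34.8 | 58 | 48.1 | 15.6 | 17 | 0.60 | 1.01 | 0.29 | 0.18 | 0.21 | 0.65 | 0.6 |
| | AAC | 17.6 | 22.5 | 18.9 | 32.7 | 16.8 | 8 | 20.3 | 19.1 | 0.78 | 0.93 | 0.54 | 1.05 | 2.2 | 0.87 | 0.92 |
| Lys | AAA | 21.7 | 27.3 | 18.4 | 53.7 | 48.5 | 64.1 | 21.9 | 24.4 | 0.79 | 1.18 | 0.4 | 0.45 | 0.34 | 0.99 | 0.89 |
| | AAG | 24 | 34.3 | 29.8 | 51.3 | 25.9 | 16.9 | 33.6 | 31.9 | 0.70 | 0.81 | 0.47 | 0.93 | 1.42 | 0.71 | 0.75 |
| Asp | GAU | 14.2 | 25.3 | 16.3 | 28.6 | 40.8 | 45 | 21 | 21.8 | 0.56 | 0.87 | 0.5 | 0.35 | 0.32 | 0.68 | 0.65 |
| | GAC | 24.2 | 24.9 | 34.3 | 26.2 | 10.7 | 8.4 | 26 | 25.1 | 0.97 | 0.71 | 0.92 | 2.26 | 2.88 | 0.93 | 0.96 |
| Glu | GAA | 28.4 | 31 | 32.7 | 54.1 | 50.4 | 65.5 | 27 | 29 | 0.92 | 0.87 | 0.52 | 0.56 | 0.43 | 1.05 | 0.98 |
| | GAG | 31.4 | 40.9 | 40.6 | 34.4 | 19.3 | 6.4 | 39.4 | 39.6 | 0.77 | 0.77 | 0.91 | 1.63 | 4.91 | 0.8 | 0.79 |
| Cys | UGU | 7.8 | 8.8 | 7.2 | 9.9 | 13.1 | 19.1 | 11.4 | 10.6 | 0.89 | 1.08 | 0.79 | 0.60 | 0.41 | 0.68 | 0.74 |
| | UGC | 19.2 | 13.3 | 12.4 | 10.5 | 6.5 | 2 | 12.3 | 12.6 | 1.44 | 1.55 | 1.83 | 2.95 | 9.6 | 1.56 | 1.52 |
| Arg | CGU | 8 | 5.4 | 8.2 | 1.5 | 3.6 | 4.4 | 4.7 | 4.5 | 1.48 | 0.98 | 5.33 | 2.22 | 1.82 | 1.7 | 1.78 |
| | CGC | 14.7 | 10.4 | 17.5 | 2.7 | 0.9 | 0.1 | 9.4 | 10.4 | 1.41 | 0.84 | 5.44 | 16.33 | 147 | 1.56 | 1.41 |
| | CGA | 6.3 | 5.3 | 12.7 | 2.3 | 2.1 | 3 | 6.6 | 6.2 | 1.19 | 0.5 | 2.74 | 3.00 | 2.1 | 0.95 | 1.02 |
| | CGG | 9.9 | 9.7 | 10.1 | 1.4 | 0.4 | 0.2 | 10.2 | 11.4 | 1.02 | 0.98 | 7.07 | 24.75 | 49.5 | 0.97 | 0.87 |
| | AGA | 10.4 | 12.2 | 12.8 | 10.9 | 25.7 | 26.3 | 12.1 | 12.2 | 0.85 | 0.81 | 0.95 | 0.40 | 0.4 | 0.86 | 0.85 |
| | AGG | 10.4 | 11.7 | 8.7 | 8.6 | 6.9 | 1.6 | 12.2 | 12 | 0.89 | 1.2 | 1.21 | 1.51 | 6.5 | 0.85 | 0.87 |
| Gly | GGU | 10.5 | 11.4 | 14.6 | 10.9 | 15.9 | 17.2 | 11.4 | 10.8 | 0.92 | 0.72 | 0.96 | 0.66 | 0.61 | 0.92 | 0.97 |
| | GGC | 24.1 | 19.7 | 28.5 | 12.4 | 5.4 | 0.6 | 21.2 | 22.2 | 1.22 | 0.85 | 1.94 | 4.46 | 40.17 | 1.14 | 1.09 |
| | GGA | 16.8 | 17.6 | 21.4 | 22.4 | 24.8 | 41.7 | 16.8 | 16.5 | 0.95 | 0.79 | 0.75 | 0.68 | 0.4 | 1 | 1.02 |
| | GGG | 15.6 | 16 | 14.6 | 10.6 | 5.7 | 2.6 | 15.2 | 16.5 | 0.98 | 1.07 | 1.47 | 2.74 | 6 | 1.03 | 0.95 |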
ET: E. tenella; GG: Gallus gallus; TG: Toxoplasma gondii; PV: Plasmodium vivax; CP: Cryptosporidium parvum; EH: Entamoeba histolytica; MM: Mus musculus; HS: Homo sapiens.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhao, Y.; Zhang, S. Comparative Analysis of Codon Usage Bias in Six Eimeria Genomes. Int. J. Mol. Sci. 2024, 25, 8398. https://doi.org/10.3390/ijms25158398
