Article

Gene Overlapping as a Modulator of Begomovirus Evolution

by Iván Martín-Hernández 1,† and Israel Pagán 1,2,*
1 Centro de Biotecnología y Genómica de Plantas UPM-INIA, 28223 Madrid, Spain
2 Departamento de Biotecnología-Biología Vegetal, Escuela Técnica Superior de Ingeniería Agronómica, Alimentaria y de Biosistemas, Universidad Politécnica de Madrid, 28045 Madrid, Spain
* Author to whom correspondence should be addressed.
† Current address: Department of Biological Chemical Physics, Rocasolano Institute of Physical Chemistry CSIC, 28006 Madrid, Spain.
Microorganisms 2022, 10(2), 366; https://doi.org/10.3390/microorganisms10020366
Submission received: 17 December 2021 / Revised: 1 February 2022 / Accepted: 1 February 2022 / Published: 4 February 2022
(This article belongs to the Special Issue Plant Pathogenic Microorganisms: State-of-the-Art Research in Spain)

Abstract

In RNA viruses, which have high mutation rates and fast evolutionary rates, gene overlapping (i.e., genomic regions that encode more than one protein) is a major factor controlling mutational load and therefore virus evolvability. Although DNA viruses use host high-fidelity polymerases for their replication, and therefore should have lower mutation rates, it has been shown that some of them have evolutionary rates comparable to those of RNA viruses. Notably, these viruses have large proportions of their genes with at least one overlapping instance. Hence, gene overlapping could be a modulator of virus evolution beyond the RNA world. To test this hypothesis, we use the genus Begomovirus of plant viruses as a model. Through comparative genomic approaches, we show that terminal gene overlapping decreases the rate of virus evolution, which is associated with lower frequency of both synonymous and nonsynonymous mutations. In contrast, internal overlapping has little effect on the pace of virus evolution. Overall, our analyses support a role for gene overlapping in the evolution of begomoviruses and provide novel information on the factors that shape their genetic diversity.

1. Introduction

Genomic regions that encode more than one protein, that is, gene overlapping, are commonplace among viruses [1,2]. Such regions have important biological and evolutionary implications. First, they are associated with virus within-host multiplication, between-host transmission, disease severity and strength of host immune response [3,4,5,6]. Second, viruses are subjected to strong selection for maintaining smaller genomes because this (i) reduces the chances for deleterious mutations to become fixed in the virus genome, particularly in viruses with high mutation rates; (ii) improves virus fitness due to faster replication; and (iii) optimizes virion formation due to physical limitations imposed by the capsid size [7,8,9]. Gene overlapping allows increasing the amount of genomic information in viral genomes while controlling for limited capsid space and speeding up the purification of deleterious mutations from the virus population by amplifying their effect, as in overlapping regions these mutations affect more than one gene at the same time [1,9,10].
If gene overlapping is selectively advantageous for viruses, it would be expected to be more frequent (i) in RNA than in DNA viruses, as the former have (in general) higher mutation rates [11]; (ii) in larger than in shorter viral genomes, to minimize the chances of deleterious mutations becoming fixed [12]; and (iii) in spherical virions, as these generally have smaller inner volumes than other capsid shapes [13]. Although Brandes and Linial [9] failed to detect an association between virion shape and frequency of gene overlapping that would support the predictions above, it has been shown that the larger the gene overlapping the greater the reduction in the rate of RNA virus evolution [1], and that gene overlapping appears to be more frequent in DNA viruses, which on average also have larger genome sizes [11], than in RNA viruses [2].
The species of the family Geminiviridae of plant viruses are notable exceptions to the virus characteristics associated with gene overlapping. Although geminiviruses are ssDNA viruses, and therefore replicate through high-fidelity polymerases [14], and have small genomes, a large proportion of their genes have at least one overlapping region [2]. The Geminiviridae family is currently divided into nine genera, of which the largest is the genus Begomovirus. This is also one of the most numerous genera of plant viruses, with more than 400 species [15]. Bipartite begomovirus genomes encode six open reading frames (ORFs): two in the virion strand (AV1 and AV2) and four in the complementary strand (AC1, AC2, AC3 and AC4), with monopartite begomoviruses encoding equivalent proteins with the same names but without the A prefix. The AV1 gene encodes the coat protein (CP), which is essential for genome encapsidation, viral movement and insect transmission. The AV2 gene, which is considered the pathogenicity gene, is also involved in movement and symptom development, and functions as a suppressor of gene silencing. This ORF is not present in New World bipartite begomoviruses. Viral DNA replication depends on the AC1 gene product (replication initiator protein, Rep). The AC2 gene encodes the transcriptional activator protein (TrAP), which interferes with transcriptional and post-transcriptional gene silencing (TGS and PTGS, respectively), and with CP expression. The AC3 gene product (replication enhancer protein, REn) enhances viral DNA accumulation and is involved in interaction with the plant-host retinoblastoma-related (RBR) proteins. Finally, AC4 counteracts PTGS by inhibiting accumulation of siRNA and is considered an important symptom determinant [15]. All six of these ORFs have at least one overlapping region in both mono- and bipartite begomoviruses [15].
Bipartite begomoviruses have two additional non-overlapping ORFs in the B-component: BC1, a movement protein (MP), and BV1, the nuclear shuttle protein (NSP) (Figure 1) [16].
Despite being DNA viruses, begomoviruses have been repeatedly shown to have high evolutionary rates (reviewed by [18]). For instance, the substitution rate of Tomato yellow leaf curl virus (TYLCV) has been estimated at 2.88 × 10⁻⁴ nucleotide substitutions per site per year, which is in the range of values for RNA viruses [12]. This fast evolutionary rate has been attributed to the effect of oxidative damage in replicated viral genomes, and/or to higher mutation rates than expected for DNA viruses [12]. If so, extensive gene overlapping in begomoviruses may contribute to modulating mutational load, and consequently the rate of virus evolution, as has been shown for RNA viruses [1]. Experimental evidence supporting this idea is scarce and sometimes contradictory. For instance, a higher variability occurred in the Tomato yellow leaf curl China virus (TYLCCNV) AC1-AC4 overlapping (OV) region than in the non-overlapping (NOV) region of AC1 [19], whereas the opposite was observed for Pepper huasteco yellow vein virus (PHYVV) [20].
Here, we analyzed the effect of gene overlapping on the rate of begomovirus evolution through comparative genomics, using sequences from 18 species. In particular, we explored whether the following evolutionary parameters vary between OV and NOV regions and among different types of gene overlap: (1) the rate of viral evolution, using overall tree length as a proxy; (2) the frequency of synonymous and nonsynonymous substitutions; (3) selection pressure; and (4) the magnitude of the effect of gene overlapping on the rate of virus evolution.

2. Materials and Methods

2.1. Sequence Data

Available sequences from begomovirus species were retrieved from GenBank. Sequences from extensively passaged isolates in non-natural hosts were excluded. When possible, we tried to minimize the presence of recombination. Species with more than 10 sequences were retained for analysis, so that we were able to include 18 mono and bipartite begomoviruses, and a total of 8239 sequences. Overall, we analyzed 125 instances of gene overlap ranging between 59 and 423 nt in length: 17 internal overlapping instances, 54 5′-terminal overlapping instances, and 54 3′-terminal overlapping instances. For simplicity, genes are named as for bipartite begomoviruses. Note that we divided sequences from Bhendi yellow vein mosaic virus (BYVMV) into two groups: one of sequences originally classified as belonging to this virus, and another originally characterized as Bhendi yellow vein India virus. Although both groups are currently considered as belonging to BYVMV [15], we chose to analyze them separately as evolutionary parameters differed between groups (Appendix A). However, analyses merging the two groups did not change our conclusions. We constructed sequence alignments for the 125 overlapping instances, and for the corresponding OV and NOV fragments of each gene. Sequence alignments of the OV regions were adjusted according to the amino acid sequence of each of the two genes involved, thus generating two data sets for each OV region. All alignments were built using MUSCLE 3.7 [21] and adjusted manually according to the amino acid sequences using AliView [22]. Alignments are available as Supplementary Material File S1.

2.2. Estimation of Tree Length

Tree lengths (t) were estimated for the OV and NOV regions of each gene. To do so, we used a maximum likelihood fitting of the General Time Reversible (GTR) nucleotide substitution model as implemented in the HyPhy package [23]. Differences in total tree length between OV and NOV regions were analyzed using a relative ratio test also utilizing HyPhy. Because t is dependent on the number of tree branches (i.e., number of sequences), when values were compared among overlapping instances, t was normalized according to the number of sequences.
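The normalization step can be illustrated with a minimal sketch. The text does not give the exact formula, so the code below assumes the simplest reading, division of t by the number of sequences in the alignment; the function name and the example values are hypothetical.

```python
def normalized_tree_length(tree_length: float, n_sequences: int) -> float:
    """Return tree length per sequence.

    Assumption of this sketch: 'normalized according to the number of
    sequences' means plain division by the sequence count.
    """
    if n_sequences <= 0:
        raise ValueError("n_sequences must be positive")
    if tree_length < 0:
        raise ValueError("tree_length cannot be negative")
    return tree_length / n_sequences


# Two hypothetical overlapping instances with different sample sizes.
raw = {"instance_A": (2.0, 40), "instance_B": (2.0, 400)}
for name, (t, n) in raw.items():
    print(name, normalized_tree_length(t, n))
```

The two hypothetical instances share the same raw t but differ tenfold once sample size is taken into account, which is why raw values are not compared across instances.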

2.3. Selection Pressures

Selection pressures for OV and NOV regions were estimated as the ratio between the mean number of nonsynonymous (dN) and synonymous (dS) nucleotide substitutions per site (dN/dS) using the fast unbiased Bayesian approximation (FUBAR) and the fixed effect likelihood (FEL) methods implemented in HyPhy (Appendix A). Because the two methods yielded similar results, only the FUBAR results are shown here. In all cases, dN/dS measures were based on neighbor-joining trees inferred using the MG94 nucleotide substitution model. Significant differences between dN/dS values in OV and NOV regions were analyzed using a population level adaptation test [24]. Values of dN and dS were also estimated. For each pair of overlapping genes, dN, dS, and dN/dS estimates were obtained for the two reading frames of the OV region. To do so, we used separate sequence alignments for the two overlapping genes, and we partitioned codons such that OV and NOV regions could be defined over the full-length sequence of each gene.
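As a reminder of how the dN/dS statistic is read, the following sketch computes the ratio from already-estimated dN and dS values and labels the result using the standard interpretation (below 1, purifying selection; above 1, positive selection). It does not reproduce FUBAR or FEL; the function names and input values are hypothetical.

```python
import math


def dn_ds_ratio(dn: float, ds: float) -> float:
    """Ratio of nonsynonymous to synonymous substitutions per site."""
    if dn < 0 or ds < 0:
        raise ValueError("dN and dS must be non-negative")
    if ds == 0:
        # The ratio is undefined (or unbounded) without synonymous changes.
        return math.inf if dn > 0 else math.nan
    return dn / ds


def selection_regime(omega: float) -> str:
    """Standard reading of dN/dS (omega)."""
    if math.isnan(omega):
        return "undefined"
    if omega < 1:
        return "purifying (negative) selection"
    if omega > 1:
        return "positive selection"
    return "neutral"


# Hypothetical estimates for the OV and NOV fragments of one gene.
for region, (dn, ds) in {"OV": (0.1, 0.4), "NOV": (0.3, 0.5)}.items():
    omega = dn_ds_ratio(dn, ds)
    print(region, round(omega, 2), selection_regime(omega))
```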

2.4. Detection of Recombination

For each pair of overlapping genes, recombination breakpoints were detected using six different methods as implemented in RDP5: RDP, GENECONV, MaxChi, 3Seq, Bootscan, and Chimaera [25]. Only recombination signals detected by at least four methods (p < 0.05) were considered as positive. For the purpose of this work, recombinants with breakpoints in the LIR and the V1/C3 limit, which are recombination hotspots [26], were not counted as such, as they did not differentially affect OV and NOV regions of any given gene. Instances with more than 10% of recombinant sequences, regardless of whether breakpoints were located in OV or NOV regions, were considered to have excessive recombination (Appendix A). Analyses were repeated excluding such instances, but conclusions did not vary. Hence, we present here results obtained using all instances.
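The two decision rules above (support from at least four of six methods at p < 0.05, and more than 10% recombinant sequences) can be written down as a short sketch. It does not call RDP5; the per-method p-values are assumed to have been exported beforehand into a dictionary, and the function names and example numbers are hypothetical.

```python
METHODS = ("RDP", "GENECONV", "MaxChi", "3Seq", "Bootscan", "Chimaera")


def is_supported_event(p_values: dict, alpha: float = 0.05,
                       min_methods: int = 4) -> bool:
    """True if at least `min_methods` of the six methods give p < alpha.

    Methods missing from `p_values` are treated as not detecting the event.
    """
    hits = sum(1 for m in METHODS if p_values.get(m, 1.0) < alpha)
    return hits >= min_methods


def has_excessive_recombination(n_recombinant: int, n_total: int,
                                threshold: float = 0.10) -> bool:
    """True if more than `threshold` of the sequences are recombinant."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_recombinant / n_total > threshold


# Hypothetical event detected by four of the six methods.
event = {"RDP": 0.001, "GENECONV": 0.03, "MaxChi": 0.2,
         "3Seq": 0.004, "Bootscan": 0.01}
print(is_supported_event(event))
print(has_excessive_recombination(n_recombinant=5, n_total=40))
```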

2.5. Statistical Analysis

The 125 overlapping instances were used for statistical analysis. Tree lengths (t) were neither normally distributed nor homoscedastic according to Kolmogorov–Smirnov and Levene’s tests, respectively. Therefore, this variable was fitted to a gamma distribution, whereas the ratio OV/NOV for t, dN, dS, and dN/dS, and the percentage of overlapping, were fitted to a normal distribution, according to Akaike’s Information Criterion (R package: rriskDistributions; [27]). Consequently, differences in values of these variables between OV and NOV regions and between types of gene overlap were analyzed by generalized linear models (GzLM), using type of region or type of overlapping as factors. Differences in the proportion of genes for which the parameters above differed between OV and NOV regions were analyzed by Fisher’s exact test [28]. Associations between parameters were tested using Pearson’s correlation tests. All statistical analyses were performed using the statistical software packages SPSS 17.0 (SPSS Inc., Chicago, IL, USA) and R v.3.6.3 [29].

3. Results

3.1. Effect of the Presence and Type of Gene Overlapping on Gene Evolution

We analyzed the effect of gene overlapping on the rate of begomovirus evolution by estimating the total length of the tree (t) inferred for the OV and NOV regions of each gene (Figure 2). In OV regions, t ranged from 0.001 to 5.218, depending on the gene–virus combination, with mean value of 0.797 (median: 0.573). Variation in t for NOV regions ranged between 0.006 and 7.676, with mean value of 1.262 (median: 0.744). A GzLM analysis using type of region (OV and NOV) as a factor indicated that t was significantly smaller in OV than in NOV regions (Wald χ² = 10.74; p = 1 × 10⁻³). In agreement with these results, in most overlapping instances, t was significantly smaller in OV than in NOV regions (93/125, χ² = 59.54, p < 1 × 10⁻⁵) (Figure 2 and Appendix A). Hence, gene overlapping generally reduces evolutionary rates. However, viruses can generate different types of gene overlap, which arise by different mechanisms and that generally differ in the resulting frameshift [10] and the degree of selective independence of the genes involved [7]. Therefore, it could be hypothesized that evolutionary rates differ by type of gene overlap.
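The proportion test for 93/125 instances can be checked numerically. The text does not say how this χ² statistic was computed, so the sketch below rests on an assumption of ours: a symmetric 2×2 table [[k, n − k], [n − k, k]] tested without continuity correction (1 degree of freedom), for which the statistic reduces to 8(k − n/2)²/n. Under that assumption the value for k = 93, n = 125 is 59.54, matching the one reported. The function name is hypothetical.

```python
import math


def symmetric_chi2(k: int, n: int) -> tuple[float, float]:
    """Chi-square (1 d.f.) and p-value for k of n instances following a trend.

    Assumption (not stated in the text): the statistic comes from the
    symmetric 2x2 table [[k, n-k], [n-k, k]], whose expected counts are
    all n/2, with no continuity correction.
    """
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    chi2 = 8.0 * (k - n / 2) ** 2 / n
    # Survival function of a chi-square with 1 d.f. is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2, p = symmetric_chi2(93, 125)
print(f"chi2 = {chi2:.2f}, p = {p:.1e}")
```

Only the standard library is used here so that the arithmetic stays visible; a statistics package would give the same result for the same table.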
Thus, three types of gene overlap were defined following [1]: (1) internal overlapping, when one of the genes contains the complete sequence of the other; (2) 5′-terminal overlapping, when the OV region is in the 5′-terminal region of the gene; and (3) 3′-terminal overlapping, when the OV region is in the 3′-terminal region of the gene. Genes with terminal overlapping showed significantly lower t values in OV than in NOV regions (Wald χ² ≥ 4.88; p ≤ 0.027), with most instances fitting this general observation (42/54, χ² = 33.33, p < 1 × 10⁻⁵; and 39/54, χ² = 21.33, p < 1 × 10⁻⁵, for 5′- and 3′-terminal overlapping, respectively). In contrast, in genes with internal overlapping no significant differences between OV and NOV regions were observed (Wald χ² ≤ 1.99; p ≥ 0.212), and instances with lower t in OV regions were not significantly more frequent (11/17, χ² = 2.94, p = 0.086) (Figure 2 and Appendix A). We also analyzed differences in the magnitude of the effect of each type of overlapping in reducing the rate of virus evolution. For that, we calculated the OV/NOV ratio for t values of overlapping instances where this parameter was significantly smaller in OV regions. A GzLM indicated that the magnitude of the effect on t depended on the type of overlapping (Wald χ² = 3.71; p = 0.028), with terminal ones showing similar effects (p = 0.314) and in both cases higher than internal overlapping (p ≤ 0.041). Same conclusions were obtained when normalized t values were used.
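The three overlap types defined above can be expressed as a small classification rule applied from the point of view of one gene. This is only a sketch under stated assumptions: coordinates are 1-based, inclusive, and counted 5′→3′ along the gene itself (strand handling is omitted); an OV region touching neither end is taken to mean that the gene contains the whole of the other; and a fully overlapped gene (as described for AC4) is reported separately. Function names and coordinates are hypothetical.

```python
def classify_overlap(gene_length: int, ov_start: int, ov_end: int) -> str:
    """Classify the overlap of one gene from the position of its OV region."""
    if not 1 <= ov_start <= ov_end <= gene_length:
        raise ValueError("the OV region must lie within the gene")
    starts_at_5p = ov_start == 1
    ends_at_3p = ov_end == gene_length
    if starts_at_5p and ends_at_3p:
        return "complete"      # 100% overlap: no NOV region to compare with
    if starts_at_5p:
        return "5'-terminal"
    if ends_at_3p:
        return "3'-terminal"
    return "internal"          # the other gene lies entirely inside this one


def overlap_percent(gene_length: int, ov_start: int, ov_end: int) -> float:
    """Percentage of the gene covered by the OV region."""
    return 100.0 * (ov_end - ov_start + 1) / gene_length


# Hypothetical coordinates for a 1000-nt gene.
for start, end in [(1, 300), (701, 1000), (200, 500)]:
    print(classify_overlap(1000, start, end),
          overlap_percent(1000, start, end))
```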
Our dataset included mono- and bipartite begomoviruses, which differ in host-virus and virus-virus protein-protein interactions [30]. This may result in differential evolutionary constraints that may modulate how gene overlapping affects virus evolution. Thus, we analyzed whether gene overlapping influenced tree length depending on the begomovirus genome structure. GzLMs using this trait (mono- vs. bipartite) as a factor indicated that it had no effect on t differences between OV and NOV regions (Wald χ² = 2.24; p = 0.137). In agreement, t was significantly higher in NOV than in OV regions when mono- and bipartite begomoviruses were analyzed separately (Wald χ² = 4.30; p = 0.038 and Wald χ² = 8.67; p = 3 × 10⁻³, respectively). In both groups of viruses, the same was observed when each type of terminal overlapping was analyzed separately (Wald χ² ≥ 4.90; p ≤ 0.027), but not for internal overlapping (Wald χ² ≤ 0.82; p ≥ 0.366). The proportion of instances with higher t in NOV than in OV regions was higher than expected by chance in terminal overlapping of both types of genome structures (χ² ≥ 4.92, p ≤ 0.026), but not in internal overlapping (χ² ≤ 0.98, p ≥ 0.173) (Appendix A).
In sum, these results indicate that the effect of gene overlapping on the rate of begomovirus evolution varies depending on its type; terminal overlapping generally reduces tree length, whereas no clear trend is observed in genes with internal overlapping. On the other hand, the type of genomic structure has little effect on the observed patterns.

3.2. Association between Selection Pressures and Gene Evolution

To further analyze how gene overlapping reduced the rate of evolution, we estimated selection pressures (dN/dS) and individual dN and dS values for the OV and NOV regions of each gene (Figure 3 and Appendix A). Average dN/dS values were 0.35 ± 0.03 and 0.50 ± 0.04 for OV and NOV regions, respectively. A GzLM using type of region as factor indicated that negative selection pressures were significantly stronger in OV than in NOV regions (Wald χ² = 11.42; p < 1 × 10⁻⁵), and we obtained similar results when each type of overlap was analyzed independently (Wald χ² ≥ 8.18; p ≤ 2 × 10⁻⁴, Figure 3 and Appendix A). In agreement, most genes had significantly higher dN/dS in NOV than in OV fragments when all types of overlap were considered together (89/125, χ² = 44.94, p < 1 × 10⁻⁵) and for the three of them independently (14/17, χ² = 14.24, p = 1.6 × 10⁻⁴; 43/54, χ² = 37.93, p < 1 × 10⁻⁵; 33/54, χ² = 5.33, p = 0.021, for internal, 5′-, and 3′-overlapping, respectively) (Figure 3 and Appendix A).
Similar analysis for dN indicated significantly lower values in OV than in NOV regions when all genes were considered together (0.24 ± 0.03 and 0.39 ± 0.04, respectively; Wald χ² = 12.10; p < 1 × 10⁻⁵), and when each type of overlapping was analyzed separately (Wald χ² ≥ 9.21; p ≤ 5 × 10⁻³). Also, in most instances, dN followed this trend (92/125, χ² = 55.70, p < 1 × 10⁻⁵), with similar results for each type of overlap (13/17, χ² = 7.53, p = 6.1 × 10⁻³; 42/54, χ² = 33.33, p < 1 × 10⁻⁵; 37/54, χ² = 14.81, p = 1.2 × 10⁻⁴, for internal, 5′-, and 3′-overlapping, respectively) (Figure 3). Finally, dS was similar in NOV and in OV regions either considering all genes together (0.72 ± 0.08 and 0.83 ± 0.07, respectively; Wald χ² = 2.16; p = 0.079) or analyzing each type of overlap independently (Wald χ² ≤ 3.18; p ≥ 0.101) (Figure 3). However, instances with dS value higher in NOV than in OV regions were more frequent than expected by chance (86/125, χ² = 35.34, p < 1 × 10⁻⁵), with similar results for each type of overlap (12/17, χ² = 5.76, p = 0.016; 35/54, χ² = 9.48, p = 2.1 × 10⁻³; 39/54, χ² = 21.33, p < 1 × 10⁻⁵, for internal, 5′-, and 3′-overlapping, respectively) (Figure 3).
When mono- and bipartite begomoviruses were analyzed separately, dN/dS and dN (Wald χ² ≥ 7.70; p ≤ 6 × 10⁻³, Wald χ² ≥ 4.18; p ≤ 0.041, respectively), but not dS (Wald χ² ≤ 3.00; p ≥ 0.083), were always higher in NOV than in OV regions for viruses with both genomic structures. When each type of overlapping was analyzed separately, similar results were obtained for dN/dS and dN (Wald χ² ≥ 4.11; p ≤ 0.043, Wald χ² ≥ 6.61; p ≤ 0.010 and Wald χ² ≥ 3.60; p ≤ 0.050, for internal, 5′-, and 3′-overlapping, respectively), and for dS (Wald χ² ≥ 0.20; p ≤ 0.652, Wald χ² ≥ 1.20; p ≤ 0.274 and Wald χ² ≥ 2.66; p ≤ 0.103, for internal, 5′-, and 3′-overlapping, respectively). As above, the proportion of overlapping instances with higher dN/dS, dN and dS in NOV than in OV regions was generally larger than those showing the opposite trend in viruses with both types of genome structure and in all types of gene overlapping (χ² ≥ 3.63, p ≤ 0.050) (Appendix A).
Thus, overlapping genes are generally subjected to stronger purifying selection in OV than in NOV fragments, which seems to be associated with a greater constraint against non-synonymous changes regardless of the type of overlap and, to a lesser extent, with constraints to synonymous changes. Again, the type of genomic structure had no influence in the observed results.

3.3. Association between Proportion of Overlap and Gene Evolution

For RNA viruses, it has been shown that the length of the OV region relative to gene length is negatively correlated with evolutionary rates in a non-linear manner [1,31]. We analyzed whether this relationship held for begomoviruses by calculating the normalized tree length for the complete sequence of each gene and assessing the strength of association between t and the proportion of gene overlap (Figure 4). As the genome structure had no effect in previous analyses, we did not consider this trait here. On the other hand, we included normalized tree lengths for AC4 (100% overlap), which were not considered previously as in this gene no OV vs. NOV comparison was possible.
The proportion of gene overlap (%) differed among types of overlap (Wald χ² = 8.74; p < 1 × 10⁻⁴): it was lower in genes with internal (35.67 ± 0.97) than terminal overlapping (60.59 ± 3.65 and 59.90 ± 3.18 for 5′- and 3′-terminal overlapping, respectively). Hence, we analyzed the association between per cent of gene overlap and t in the complete sequence of each gene for all genes together and for each type of overlap separately. We performed bivariate analysis considering linear and nonlinear regressions. When a significant association was found, it was best explained by a negative logarithmic relationship between the length of overlap and t (Figure 4). Bivariate analysis revealed a significant negative logarithmic association between these two variables when all instances were considered together (r = −0.33; p < 1 × 10⁻⁴; Figure 4), with similar results when excluding values for AC4 (r = −0.32; p < 1 × 10⁻⁴). We also found a significant negative logarithmic association in both types of terminal overlap (r = −0.37, p = 9 × 10⁻³ and r = −0.31, p = 0.027; for 5′-, and 3′-overlapping, respectively), but not for internal ones with (r = −0.25, p = 0.191; Figure 4) and without (r = 0.23, p = 0.383) AC4 values. Comparable results were obtained using only those genes for which t values were higher in NOV than in OV regions.
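One way to read a "negative logarithmic association" is as a linear relationship between t and the logarithm of the percentage of overlap, t = a + b·ln(overlap) with b < 0. The sketch below computes the Pearson correlation on that transformed scale. This reading, the function names, and the data points are assumptions for illustration; they are not the alignments or the fitting procedure used in the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def log_association(overlap_percent, tree_length):
    """Correlation between ln(percent overlap) and tree length."""
    return pearson_r([math.log(p) for p in overlap_percent], tree_length)


# Hypothetical genes: percent overlap and normalized tree length.
overlap = [10, 20, 40, 80, 100]
t_norm = [0.050, 0.040, 0.032, 0.022, 0.020]
print(round(log_association(overlap, t_norm), 2))
```

A relationship of this form flattens as overlap increases: each doubling of the overlap lowers t by the same amount, so the change per additional percentage point of overlap becomes progressively smaller.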

4. Discussion

Several non-mutually exclusive theories have been proposed to explain the abundance of gene overlapping in viruses: (i) it has a role in gene regulation by providing an inherent mechanism for coordinated expression [7]; (ii) it is an effective mechanism for generating novel genes while keeping genome size minimized, by introducing a new reading frame on top of an existing one [32,33]; or (iii) as mutations in these regions affect more than one gene, gene overlapping amplifies the deleterious effect of mutations, thus quickly eliminating such mutations from the viral population, particularly in RNA viruses which have higher mutation rates [7,34,35]. Although there is general agreement on the role of gene overlapping in maintaining genomic compression [10,31,36,37], its effect on virus evolutionary rates remains more elusive [1,2]. This is particularly so for DNA viruses that despite having in general lower mutation rates than RNA viruses have in some cases larger proportion of their genes with at least one overlapping instance [2]. Here, we analyzed whether in the largest genus of plant DNA viruses, whose genome is enriched in gene overlapping instances, this feature modulates the rate of gene evolution.
Our comparative genomic analyses in species of the genus Begomovirus indicate that tree length (as a proxy of the rate of evolution) was generally smaller in OV than in NOV regions, with most overlapping instances following this rule. This agrees with the predictions of mathematical models [34,38,39]. Interestingly, these models also predict that the reduction in evolutionary rate is the consequence of correlations at overlapping sites, which are stronger in positions where a mutation would result in a nonsynonymous change in both overlapping genes than in positions where mutations are synonymous in one gene and nonsynonymous in the other [7,34,38]. This may explain why our results indicate that the reduction in the genetic diversity of OV regions is associated with decreased dN, but not dS although in most instances OV regions had lower values of both parameters: gene overlapping would influence both synonymous and nonsynonymous substitution, but this effect would be stronger in nonsynonymous ones. There was significant negative (logarithmic) correlation between the length of overlap and the genetic diversity of each gene; that is, the longer the OV region, the lower the evolutionary rate. This agrees with theoretical models, which predict that evolutionary rate is expected to decline nonlinearly with increasing overlap [7]. This negative logarithmic association also indicates that an increased proportion of gene overlapping reduces begomovirus evolutionary rates up to a threshold, beyond which larger overlapping has no effect on tree length. Thus, long overlapping regions cannot be fully explained by their effect on evolutionary rates alone, and other selection pressures, such as genome compression or coordinated gene expression are likely to play a role.
Altogether, our results provide compelling evidence supporting the role of gene overlapping in reducing the rate of begomovirus evolution. This observation is in accordance with previous reports for a variety of RNA viruses [1,40,41,42]. In most of these cases, the reduction of the rate of virus evolution associated with gene overlapping has been attributed to the need of these viruses to buffer excessive mutational load due to high mutation rates. To date, however, estimates of mutation rates in DNA viruses suggest that these are lower than for RNA viruses [11]. Two lines of evidence suggest that this might not be the case for begomoviruses. First, rough estimates of mutation frequency in TYLCCNV showed values around 1 × 10⁻⁴ [19], which is comparable to the variation reported for plant RNA viruses and higher than for other ssDNA viruses [11,43]. Second, it has been shown that some of the DNA polymerases involved in begomovirus replication are error-prone in conditions equivalent to those in which they amplify the viral genome [44,45]. Hence, begomoviruses could have evolved overlapping regions as a safety mechanism to control high mutation rates.
Evolutionary constraints imposed by gene overlapping are a double-edged sword. They restrict the fixation of deleterious mutations; but at the same time, they leave little room to increase virus fitness, as beneficial mutations in one gene are often deleterious in the other and are therefore purged [1,4]. Viruses are faced with the need to reconcile these two facets such that they limit the fixation of unfit mutations but allow generation of beneficial genetic diversity. To do so, it has been shown that viruses may use a “segregated” organization in which overlapped regions harbor functional domains of one gene or the other, but never both [4]. Thus, gene overlapping imposes a certain degree of evolutionary constraint, as mutations affect more than one gene at the same time. However, this is not as strong as if both genes would harbor functional domains in the overlapping region, or as relaxed as if both genes would not overlap. This strategy results in higher fitness peaks than in the absence of gene overlapping [4]. Interestingly, some evidence suggests that begomoviruses may use a similar strategy. For instance, AV1 functional domains involved in DNA shuttle into the nucleus or in vector transmission are located at the N-terminal region of the protein, overlapping with AV2 [46]; whereas hydrophobic domains involved in the silencing suppression activity of AV2 locate at the NOV region of this protein [47]. Similarly, in AC2 the domain responsible for repressing AV1 expression is in the NOV region of this gene [48], whereas the OV region of AC3 is rich in functional domains [47].
Despite the general trend toward a reduction of genetic diversity in OV compared with NOV regions, when each type of overlapping was analyzed separately, this effect remained significant only in instances with 5′- and 3′-terminal overlap, whereas nearly one-third of the instances with internal overlap showed the opposite trend. Different types of overlapping vary in the preponderance of the associated frameshifting [10]. However, in begomoviruses all overlapping instances have +1/−1 frameshift, which are identical in the extent to which they allow selective independence of the overlapping genes [7]. Alternatively, in our dataset we included mono- and bipartite begomoviruses, for which different functions have been attributed to the C4/AC4 proteins [30]. Different selective pressures on C4/AC4 depending on its function may impose different constraints on its evolution, modulating the buffering effects of gene overlapping on the accumulation of mutations on AC1. We do not favor this hypothesis as our results indicate that genomic structure has little effect on the role of gene overlapping as modulator of begomovirus evolution. Another possible explanation for the observed differences is that, as we restricted our analyses to a single virus genus and each type of overlap occurs in the same genes across species, differences between terminal and internal overlapping reflect particular characteristics of the genes involved. Indeed, internal overlapping instances involved the same two genes (AC1 and AC4) in all species. If, for instance, the AC1 gene is dominating the evolution of AC4, as has been shown for younger overlapping genes generated by overprinting over older ones [33,49], the resulting internal overlapping would have less effect in the evolution of AC1, in accordance with our results. 
In addition, note that AC1 is involved in virus replication, which is a key component of virus fitness, thus this gene is more likely to drive AC4 evolution rather than the other way around. In support of this hypothesis, it has been shown that AC1 is under strong negative selection, whereas AC4 is under positive selection [50]. Finally, at odds with the examples mentioned above, functional conserved domains are not segregated in AC1/AC4 [47,51], which would also support that the observed differences respond to gene-specific features.
An additional source of gene-specific heterogeneity in our dataset that could explain the differential effect of internal and terminal overlapping in begomovirus evolution is the presence of recombination. Large fragments of AC1 (including the region overlapping with AC4) are recombination hotspots, whereas AC2/AC3 and large portions of AV1 and AV2 are coldspots [26,52]. It has been hypothesized that recombination allows removing deleterious mutations with high efficiency, as reviewed by [53]. Hence, the limited effect of AC1/AC4 internal overlapping on virus evolutionary rates could be explained by a higher frequency of recombination in AC1, which in NOV regions would have consequences similar to those of gene overlapping. Although we cannot completely discard such a role of recombination, at least in our dataset several observations argue against it. First, the percentage of instances with over 10% of recombinant sequences was evenly distributed across types of overlapping (31–35%, Appendix A). If recombination in AC1/AC4 were to explain our results, we would have expected more frequent recombination in internal than in terminal overlapping. Rather, virus species identity seemed to explain most of the variation in recombination frequency, with three species (Bhendi yellow vein mosaic virus, Chilli leaf curl virus and Okra enation leaf curl virus) accounting for two thirds of the overlapping instances with excessive recombination. Second, when these instances (41/125) were removed from the analyses, we still observed higher t values in NOV than in OV regions in terminal (30/37 and 27/36 instances, for 5′-, and 3′-terminal overlapping, respectively), but not in internal (5/10 instances), overlapping. Hence, our conclusions hold regardless of the presence of extensive recombination.
Some cautionary comments on our results are called for, however. First, the number of instances is not fully balanced among types of overlapping, with fewer instances of internal than of either type of terminal overlap. Hence, the lack of a significant effect of OV regions in internal overlap could be due to the reduced sample size. Second, because we restricted our analyses to a single virus genus, overlapping instances occur in the same genes across species, which may reduce the range of overlapping lengths included in the regression analyses, with a consequent reduction in statistical power. However, the range of terminal overlapping lengths included was wide enough to detect a significant correlation between percent of gene overlapping and genetic diversity. This range was much smaller for internal overlapping, which again may explain the lack of association between the two analyzed traits. Finally, we could include only 18 out of the 420 begomovirus species, as these were the only ones that fulfilled the criteria for inclusion in our analyses. Despite the small sample size, our results indicate strongly significant effects, which support a relevant role of gene overlapping in begomovirus evolution.
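As an illustration of the two traits related in the regression analyses, the following Python sketch derives them from a few rows of Table A1, following the definitions given in the Figure 4 caption: the proportion of gene overlap (length of overlap/total gene length) and the tree length scaled by the number of sequences (t/number of sequences). This is a sketch under stated assumptions, not the authors' analysis: the choice of the OV-region t value, the five example rows, and the hand-written `pearson_r` helper are all assumptions made for illustration, and the resulting coefficient has no bearing on the published results.

```python
from math import sqrt

# A few 5'-terminal AC1-AC2 rows transcribed from Table A1:
# (species, number of sequences N, overlap length, gene length, t of the OV region)
# Using the OV-region t here is an assumption made for illustration.
ROWS = [
    ("African cassava mosaic virus", 32, 93, 408, 0.37),
    ("Bean golden mosaic virus", 158, 89, 390, 0.29),
    ("East African cassava mosaic Kenya virus", 71, 77, 408, 0.29),
    ("Pepper golden mosaic virus", 54, 59, 390, 0.66),
    ("Pepper huasteco yellow vein virus", 19, 80, 417, 0.01),
]


def overlap_proportion(overlap_len, gene_len):
    """Proportion of the gene that overlaps (length of overlap / total gene length)."""
    return overlap_len / gene_len


def scaled_tree_length(t, n_sequences):
    """Tree length scaled by the number of sequences (t / N)."""
    return t / n_sequences


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xs = [overlap_proportion(ov, gene) for _, _, ov, gene, _ in ROWS]
ys = [scaled_tree_length(t, n) for _, n, _, _, t in ROWS]
r = pearson_r(xs, ys)

if __name__ == "__main__":
    for (species, *_), x, y in zip(ROWS, xs, ys):
        print(f"{species}: overlap proportion = {x:.3f}, t/N = {y:.4f}")
    print(f"Pearson r over these example rows = {r:.3f}")
```

Because overlapping instances occur in the same genes across species, the values of `xs` within one type of overlap span a narrow range, which is the limitation on statistical power noted above.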
In sum, this work provides novel evidence of the selective constraints imposed by gene overlapping on the pace of begomovirus evolution. Whether this effect is general for DNA viruses would be an interesting avenue of future research.

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/microorganisms10020366/s1, File S1: Sequence alignments used in this work.

Author Contributions

Conceptualization, I.P.; methodology, I.P.; formal analysis, I.M.-H.; data curation, I.M.-H.; writing—original draft preparation, I.P.; writing—review and editing, I.P.; funding acquisition, I.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Plan Nacional I+D+i, Ministerio de Economía y Competitividad (Agencia Estatal de Investigación), Spain [PID2019-109579RB-I00] to IP.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data available as Supplementary Material and as Appendix A.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A

Table A1. Statistical parameters of traits associated with the rate of evolutionary change in overlapping genes.

| Species ¹ | N ² | Overlapping ORF ³ | Overlapping length ³ | Gene length ⁴ | t ⁵ (NOV) | t ⁵ (OV) | dN/dS (NOV) | dN/dS (OV) |
|---|---|---|---|---|---|---|---|---|
| Internal (11/17) | | | | | | | | |
| African cassava mosaic virus (B) | 27 (4%) | AC1-AC4 | 423 | 1077 | 0.53 | 0.41 | 0.39 | 0.11 |
| Alternanthera yellow vein virus (M) | 11 (8%) | AC1-AC4 | 291 | 1086 | 0.47 | 0.64 | 0.54 | 0.33 |
| Bean golden mosaic virus (B) | 121 (0%) | AC1-AC4 | 258 | 1086 | 0.44 | 0.86 | 0.34 | 0.06 |
| Bhendi yellow vein mosaic virus* (M) | 39 (98%) | AC1-AC4 | 294 | 1092 | 1.25 | 0.85 | 0.48 | 0.09 |
| Bhendi yellow vein mosaic virus (M) | 32 (99%) | AC1-AC4 | 303 | 1092 | 1.90 | 1.35 | 0.11 | 0.08 |
| Chilli leaf curl virus (M) | 16 (28%) | AC1-AC4 | 300 | 1086 | 0.86 | 0.83 | 0.22 | 0.20 |
| Cotton leaf curl Gezira virus (M) | 21 (9%) | AC1-AC4 | 294 | 1089 | 0.22 | 0.89 | 0.37 | 0.02 |
| Cotton leaf curl Multan virus (M) | 50 (2%) | AC1-AC4 | 303 | 1092 | 0.67 | 0.21 | 0.07 | 0.02 |
| East African cassava mosaic Kenya virus (B) | 50 (0%) | AC1-AC4 | 297 | 1065 | 0.27 | 0.31 | 0.05 | 0.07 |
| East African cassava mosaic Malawi virus (B) | 13 (0%) | AC1-AC4 | 234 | 1080 | 0.07 | 0.06 | 0.20 | 0.19 |
| East African cassava mosaic virus (B) | 153 (2%) | AC1-AC4 | 234 | 1080 | 1.55 | 1.15 | 0.41 | 0.36 |
| East African cassava mosaic Zanzibar virus (B) | 14 (29%) | AC1-AC4 | 258 | 1080 | 0.35 | 0.09 | 0.08 | 0.03 |
| Okra enation leaf curl virus (M) | 60 (92%) | AC1-AC4 | 308 | 1089 | 2.96 | 1.48 | 0.29 | 0.24 |
| South African cassava mosaic virus (B) | 125 (0%) | AC1-AC4 | 297 | 1080 | 0.55 | 0.79 | 0.09 | 0.14 |
| Sweet potato leaf curl virus (M) | 17 (41%) | AC1-AC4 | 258 | 1095 | 0.79 | 0.99 | 0.12 | 0.28 |
| Tomato leaf curl New Delhi virus (B) | 97 (3%) | AC1-AC4 | 303 | 1086 | 3.75 | 3.34 | 0.43 | 0.39 |
| Tomato yellow leaf curl virus (M) | 397 (68%) | AC1-AC4 | 294 | 1074 | 2.43 | 1.60 | 0.70 | 0.05 |
| 5′-terminal (42/54) | | | | | | | | |
| African cassava mosaic virus (B) | 32 (3%) | AC1-AC2 | 93 | 408 | 0.21 | 0.37 | 0.11 | 0.20 |
| | 31 (3%) | AC2-AC3 | 260 | 405 | 0.31 | 0.03 | 0.50 | 0.03 |
| | 26 (0%) | AV1-AV2 | 193 | 777 | 0.39 | 0.26 | 0.37 | 0.15 |
| Alternanthera yellow vein virus (M) | 11 (18%) | AC1-AC2 | 98 | 406 | 0.22 | 0.01 | 0.22 | 0.02 |
| | 13 (0%) | AC2-AC3 | 260 | 405 | 0.71 | 0.27 | 0.11 | 0.03 |
| | 10 (60%) | AV1-AV2 | 189 | 771 | 0.33 | 0.76 | 1.15 | 0.21 |
| Bean golden mosaic virus (B) | 158 (0%) | AC1-AC2 | 89 | 390 | 1.30 | 0.29 | 0.40 | 0.13 |
| | 158 (0%) | AC2-AC3 | 254 | 399 | 1.50 | 0.25 | 0.26 | 0.05 |
| Bhendi yellow vein mosaic virus* (M) | 51 (71%) | AC1-AC2 | 104 | 453 | 1.91 | 1.68 | 0.33 | 0.29 |
| | 51 (20%) | AC2-AC3 | 308 | 405 | 3.44 | 0.74 | 0.28 | 0.14 |
| | 50 (40%) | AV1-AV2 | 206 | 771 | 1.76 | 0.85 | 0.63 | 0.14 |
| Bhendi yellow vein mosaic virus (M) | 57 (54%) | AC1-AC2 | 104 | 453 | 2.42 | 2.82 | 1.01 | 0.58 |
| | 57 (2%) | AC2-AC3 | 308 | 405 | 1.80 | 0.56 | 1.18 | 0.56 |
| | 56 (16%) | AV1-AV2 | 206 | 771 | 1.63 | 0.61 | 0.38 | 0.21 |
| Chilli leaf curl virus (M) | 18 (33%) | AC1-AC2 | 98 | 405 | 0.99 | 0.30 | 0.17 | 0.02 |
| | 23 (17%) | AC2-AC3 | 260 | 405 | 3.08 | 0.79 | 0.47 | 0.25 |
| | 22 (18%) | AV1-AV2 | 197 | 771 | 1.22 | 1.33 | 0.18 | 0.24 |
| Cotton leaf curl Gezira virus (M) | 32 (6%) | AC1-AC2 | 101 | 405 | 0.01 | 0.33 | 0.02 | 0.52 |
| | 32 (0%) | AC2-AC3 | 257 | 402 | 0.47 | 0.04 | 0.42 | 0.07 |
| | 31 (0%) | AV1-AV2 | 209 | 777 | 0.35 | 0.37 | 0.16 | 0.24 |
| Cotton leaf curl Multan virus (M) | 58 (2%) | AC1-AC2 | 104 | 453 | 0.59 | 0.65 | 0.12 | 0.14 |
| | 59 (2%) | AC2-AC3 | 308 | 405 | 1.08 | 0.80 | 1.11 | 0.78 |
| | 59 (15%) | AV1-AV2 | 206 | 771 | 1.04 | 0.75 | 0.44 | 0.37 |
| East African cassava mosaic Kenya virus (B) | 71 (0%) | AC1-AC2 | 77 | 408 | 0.40 | 0.29 | 0.25 | 0.08 |
| | 71 (0%) | AC2-AC3 | 260 | 405 | 0.34 | 0.22 | 0.18 | 0.06 |
| | 64 (0%) | AV1-AV2 | 197 | 774 | 0.23 | 0.18 | 0.55 | 0.10 |
| East African cassava mosaic Malawi virus (B) | 9 (0%) | AC1-AC2 | 92 | 408 | 0.00 | 0.00 | - | - |
| | 10 (0%) | AC2-AC3 | 260 | 405 | 0.06 | 0.03 | 0.57 | 0.14 |
| | 12 (0%) | AV1-AV2 | 191 | 777 | 0.05 | 0.08 | 0.17 | 0.22 |
| East African cassava mosaic virus (B) | 166 (2%) | AC1-AC2 | 92 | 405 | 1.79 | 0.97 | 2.47 | 0.27 |
| | 162 (0%) | AC2-AC3 | 260 | 405 | 1.38 | 0.60 | 0.67 | 0.05 |
| | 105 (0%) | AV1-AV2 | 197 | 775 | 0.79 | 0.31 | 0.58 | 0.09 |
| East African cassava mosaic Zanzibar virus (B) | 15 (27%) | AC1-AC2 | 92 | 408 | 0.23 | 0.01 | 0.50 | 0.15 |
| | 15 (0%) | AC2-AC3 | 260 | 405 | 0.11 | 0.10 | 1.76 | 0.42 |
| | 15 (0%) | AV1-AV2 | 197 | 775 | 0.20 | 0.07 | 0.63 | 0.27 |
| Okra enation leaf curl virus (M) | 67 (97%) | AC1-AC2 | 104 | 453 | 3.45 | 3.18 | 0.51 | 0.23 |
| | 68 (32%) | AC2-AC3 | 308 | 405 | 1.04 | 0.63 | 1.44 | 0.17 |
| | 41 (58%) | AV1-AV2 | 188 | 771 | 0.64 | 1.26 | 0.24 | 0.62 |
| Pepper golden mosaic virus (B) | 54 (57%) | AC1-AC2 | 59 | 390 | 1.49 | 0.66 | 0.47 | 0.12 |
| | 54 (0%) | AC2-AC3 | 254 | 399 | 0.94 | 0.89 | 0.34 | 0.94 |
| Pepper huasteco yellow vein virus (B) | 19 (5%) | AC1-AC2 | 80 | 417 | 1.01 | 0.01 | 0.38 | 0.28 |
| | 45 (0%) | AC2-AC3 | 254 | 399 | 0.74 | 0.42 | 1.99 | 0.32 |
| South African cassava mosaic virus (B) | 131 (0%) | AC1-AC2 | 92 | 408 | 0.56 | 0.77 | 0.47 | 1.52 |
| | 130 (0%) | AC2-AC3 | 260 | 405 | 0.67 | 0.48 | 0.40 | 0.64 |
| | 126 (4%) | AV1-AV2 | 191 | 777 | 0.73 | 0.64 | 1.17 | 0.65 |
| Sweet potato leaf curl virus (M) | 18 (39%) | AC1-AC2 | 92 | 452 | 0.54 | 1.68 | 0.39 | 0.33 |
| | 14 (7%) | AC2-AC3 | 278 | 435 | 2.96 | 0.73 | 1.78 | 0.25 |
| | 18 (9%) | AV1-AV2 | 176 | 765 | 0.52 | 0.17 | 0.87 | 0.10 |
| Tomato leaf curl New Delhi virus (B) | 88 (4%) | AC1-AC2 | 98 | 420 | 2.83 | 1.47 | 0.71 | 0.62 |
| | 113 (1%) | AC2-AC3 | 281 | 411 | 5.99 | 1.56 | 0.57 | 0.46 |
| | 115 (0%) | AV1-AV2 | 179 | 771 | 2.90 | 2.12 | 1.16 | 0.77 |
| Tomato yellow leaf curl virus (M) | 588 (9%) | AC1-AC2 | 92 | 408 | 7.68 | 5.22 | 0.96 | 0.43 |
| | 521 (10%) | AC2-AC3 | 260 | 405 | 6.37 | 3.08 | 0.95 | 0.64 |
| | 593 (10%) | AV1-AV2 | 191 | 777 | 3.47 | 3.16 | 0.89 | 0.42 |
| 3′-terminal (39/54) | | | | | | | | |
| African cassava mosaic virus (B) | 27 (3%) | AC1-AC2 | 93 | 1077 | 0.53 | 0.36 | 0.50 | 0.29 |
| | 32 (3%) | AC2-AC3 | 260 | 408 | 0.21 | 0.03 | 0.44 | 0.20 |
| | 33 (0%) | AV1-AV2 | 193 | 342 | 0.54 | 0.24 | 0.47 | 0.18 |
| Alternanthera yellow vein virus (M) | 11 (18%) | AC1-AC2 | 98 | 1086 | 0.47 | 0.01 | 0.48 | 0.02 |
| | 11 (0%) | AC2-AC3 | 260 | 406 | 0.22 | 0.27 | 0.45 | 0.57 |
| | 12 (60%) | AV1-AV2 | 189 | 348 | 1.21 | 0.76 | 0.57 | 0.13 |
| Bean golden mosaic virus (B) | 121 (0%) | AC1-AC2 | 89 | 1086 | 0.44 | 0.10 | 0.56 | 0.34 |
| | 158 (0%) | AC2-AC3 | 254 | 390 | 0.30 | 0.28 | 0.65 | 0.61 |
| Bhendi yellow vein mosaic virus* (M) | 39 (92%) | AC1-AC2 | 104 | 1092 | 1.25 | 0.91 | 0.58 | 0.38 |
| | 51 (20%) | AC2-AC3 | 308 | 453 | 1.91 | 0.91 | 0.27 | 0.03 |
| | 50 (40%) | AV1-AV2 | 206 | 366 | 1.34 | 0.46 | 0.13 | 0.10 |
| Bhendi yellow vein mosaic virus (M) | 23 (98%) | AC1-AC2 | 104 | 1092 | 1.90 | 0.88 | 0.16 | 0.70 |
| | 57 (2%) | AC2-AC3 | 308 | 453 | 2.42 | 0.57 | 0.21 | 0.51 |
| | 55 (16%) | AV1-AV2 | 206 | 366 | 0.89 | 0.32 | 0.78 | 0.54 |
| Chilli leaf curl virus (M) | 16 (38%) | AC1-AC2 | 98 | 1086 | 0.86 | 0.10 | 0.30 | 0.12 |
| | 18 (17%) | AC2-AC3 | 260 | 405 | 0.99 | 0.50 | 0.66 | 0.57 |
| | 16 (19%) | AV1-AV2 | 197 | 357 | 0.26 | 0.45 | 1.76 | 2.67 |
| Cotton leaf curl Gezira virus (M) | 21 (9%) | AC1-AC2 | 101 | 1089 | 0.22 | 0.33 | 0.07 | 0.33 |
| | 32 (0%) | AC2-AC3 | 257 | 405 | 0.01 | 0.03 | 0.24 | 0.47 |
| | 29 (0%) | AV1-AV2 | 209 | 369 | 0.08 | 0.50 | 0.07 | 0.63 |
| Cotton leaf curl Multan virus (M) | 50 (2%) | AC1-AC2 | 104 | 1092 | 0.67 | 0.38 | 0.27 | 0.03 |
| | 58 (2%) | AC2-AC3 | 308 | 453 | 0.59 | 0.75 | 0.11 | 0.14 |
| | 57 (16%) | AV1-AV2 | 206 | 366 | 0.45 | 0.46 | 0.24 | 0.26 |
| East African cassava mosaic Kenya virus (B) | 50 (0%) | AC1-AC2 | 77 | 1065 | 0.27 | 0.01 | 0.38 | 0.25 |
| | 71 (0%) | AC2-AC3 | 260 | 408 | 0.40 | 0.25 | 0.25 | 0.61 |
| | 56 (0%) | AV1-AV2 | 197 | 357 | 0.16 | 0.18 | 0.42 | 0.45 |
| East African cassava mosaic Malawi virus (B) | 13 (0%) | AC1-AC2 | 92 | 1080 | 0.07 | 0.00 | 0.16 | 0.00 |
| | 9 (0%) | AC2-AC3 | 260 | 408 | 0.00 | 0.00 | - | - |
| | 12 (0%) | AV1-AV2 | 191 | 351 | 0.06 | 0.08 | 0.34 | 0.39 |
| East African cassava mosaic virus (B) | 153 (3%) | AC1-AC2 | 92 | 1080 | 1.55 | 0.62 | 0.43 | 0.25 |
| | 166 (0%) | AC2-AC3 | 260 | 405 | 1.79 | 0.62 | 0.43 | 0.35 |
| | 136 (0%) | AV1-AV2 | 197 | 357 | 0.75 | 0.32 | 0.36 | 0.22 |
| East African cassava mosaic Zanzibar virus (B) | 14 (29%) | AC1-AC2 | 92 | 1080 | 0.35 | 0.01 | 0.32 | 0.17 |
| | 15 (0%) | AC2-AC3 | 260 | 408 | 0.23 | 0.10 | 0.77 | 0.28 |
| | 13 (0%) | AV1-AV2 | 197 | 357 | 0.15 | 0.07 | 0.65 | 0.47 |
| Okra enation leaf curl virus (M) | 60 (92%) | AC1-AC2 | 104 | 1089 | 2.96 | 4.66 | 0.05 | 0.25 |
| | 67 (33%) | AC2-AC3 | 308 | 453 | 3.45 | 0.68 | 0.32 | 0.04 |
| | 45 (53%) | AV1-AV2 | 188 | 348 | 0.38 | 0.75 | 0.21 | 0.50 |
| Pepper golden mosaic virus (B) | 54 (57%) | AC1-AC2 | 59 | 1050 | 1.47 | 0.68 | 0.25 | 0.01 |
| | 54 (0%) | AC2-AC3 | 254 | 390 | 1.49 | 0.92 | 0.44 | 0.29 |
| Pepper huasteco yellow vein virus (B) | 44 (2%) | AC1-AC2 | 80 | 1050 | 0.81 | 0.12 | 0.26 | 0.11 |
| | 19 (0%) | AC2-AC3 | 254 | 417 | 1.01 | 0.21 | 1.40 | 1.03 |
| South African cassava mosaic virus (B) | 125 (0%) | AC1-AC2 | 92 | 1080 | 0.55 | 0.60 | 1.04 | 1.47 |
| | 131 (0%) | AC2-AC3 | 260 | 408 | 0.56 | 0.42 | 0.70 | 0.43 |
| | 128 (4%) | AV1-AV2 | 191 | 351 | 1.16 | 0.99 | 0.24 | 0.08 |
| Sweet potato leaf curl virus (M) | 17 (41%) | AC1-AC2 | 92 | 1095 | 0.79 | 1.61 | 0.57 | 1.50 |
| | 18 (6%) | AC2-AC3 | 278 | 452 | 0.54 | 1.21 | 0.70 | 1.16 |
| | 18 (9%) | AV1-AV2 | 176 | 345 | 0.62 | 0.16 | 0.18 | 0.06 |
| Tomato leaf curl New Delhi virus (B) | 97 (3%) | AC1-AC2 | 98 | 1086 | 3.75 | 1.86 | 0.16 | 0.13 |
| | 88 (1%) | AC2-AC3 | 281 | 420 | 2.83 | 0.88 | 0.12 | 0.84 |
| | 105 (0%) | AV1-AV2 | 179 | 339 | 2.78 | 1.58 | 0.71 | 0.52 |
| Tomato yellow leaf curl virus (M) | 397 (8%) | AC1-AC2 | 92 | 1074 | 2.43 | 2.23 | 0.40 | 1.52 |
| | 588 (9%) | AC2-AC3 | 260 | 408 | 7.67 | 4.27 | 0.20 | 0.87 |
| | 623 (9%) | AV1-AV2 | 191 | 351 | 3.01 | 2.27 | 0.40 | 0.18 |
1 Number of genes with significant differences in tree length between OV and NOV regions over total number (shown in parentheses after each type of overlapping). Deviation from randomness was tested by Fisher’s exact test (p < 0.05). M: Monopartite; B: Bipartite. Asterisks indicate sequences formerly classified as Bhendi yellow vein India virus.
2 Number of sequences used. Percentage of recombinant sequences is shown in parentheses.
3 Name of the genes involved in the overlapping instance, and length of the OV region.
4 Length of the largest gene (internal overlapping), of the gene with 5′-terminal overlapping (5′-terminal), and of the gene with 3′-terminal overlapping (3′-terminal).
5 Tree length of inferred phylogenies for OV and NOV regions of each overlapping instance.

References

  1. Simon-Loriere, E.; Holmes, E.C.; Pagán, I. The effect of gene overlapping on the rate of RNA virus evolution. Mol. Biol. Evol. 2013, 30, 1916–1928.
  2. Schlub, T.E.; Holmes, E.C. Properties and abundance of overlapping genes in viruses. Virus Evol. 2020, 6, veaa009.
  3. Gogarten, J.P.; Townsend, J.P. Horizontal Gene Transfer, Genome Innovation and Evolution. Nat. Rev. Microbiol. 2005, 3, 679–687.
  4. Fernandes, J.D.; Faust, T.B.; Strauli, N.B.; Smith, C.; Crosby, D.C.; Nakamura, R.L.; Hernandez, R.D.; Frankel, A.D. Functional Segregation of Overlapping Genes in HIV. Cell 2016, 167, 1762–1773.
  5. Nelson, C.W.; Ardern, Z.; Goldberg, T.L.; Meng, C.; Kuo, C.-H.; Ludwig, C.; Kolokotronis, S.-O.; Wei, X. Dynamically evolving novel overlapping gene as a factor in the SARS-CoV-2 pandemic. eLife 2020, 9, e59633.
  6. Wright, B.W.; Ruan, J.; Molloy, M.P.; Jaschke, P.R. Genome modularization reveals overlapped gene topology is necessary for efficient viral reproduction. ACS Synth. Biol. 2020, 9, 3079–3090.
  7. Krakauer, D.C. Stability and evolution of overlapping genes. Evolution 2000, 54, 731–739.
  8. Belshaw, R.; Gardner, A.; Rambaut, A.; Pybus, O.G. Pacing a small cage: Mutation and RNA viruses. Trends Ecol. Evol. 2008, 23, 188–193.
  9. Brandes, N.; Linial, M. Gene overlapping and size constraints in the viral world. Biol. Direct 2016, 11, 26.
  10. Belshaw, R.; Pybus, O.G.; Rambaut, A. The evolution of genome compression and genomic novelty in RNA viruses. Genome Res. 2007, 17, 1496–1504.
  11. Sanjuán, R.; Domingo-Calap, P. Mechanisms of viral mutation. Cell Mol. Life Sci. 2016, 73, 4433–4448.
  12. Duffy, S.; Shackelton, L.A.; Holmes, E.C. Rates of evolutionary change in viruses: Patterns and determinants. Nat. Rev. Genet. 2008, 9, 267–276.
  13. Cui, J.; Schlub, T.E.; Holmes, E.C. An Allometric Relationship between the Genome Length and Virion Volume of Viruses. J. Virol. 2014, 88, 6403–6410.
  14. Wu, M.; Wei, H.; Tan, H.; Pan, S.; Liu, Q.; Bejarano, E.R.; Lozano-Durán, R. Plant DNA polymerases α and δ mediate replication of geminiviruses. Nat. Commun. 2021, 12, 2780.
  15. Fiallo-Olivé, E.; Lett, J.-M.; Martin, D.P.; Roumagnac, P.; Varsani, A.; Zerbini, F.M.; Navas-Castillo, J.; ICTV Report Consortium. ICTV Virus Taxonomy Profile: Geminiviridae 2021. J. Gen. Virol. 2021, 102, 001696.
  16. Hanley-Bowdoin, L.; Bejarano, E.R.; Robertson, D.; Mansoor, S. Geminiviruses: Masters at redirecting and reprogramming plant processes. Nat. Rev. Microbiol. 2013, 11, 777–788.
  17. Zerbini, F.M.; Briddon, R.W.; Idris, A.; Martin, D.P.; Moriones, E.; Navas-Castillo, J.; Rivera-Bustamante, R.; Roumagnac, P.; Varsani, A.; ICTV Report Consortium. ICTV Virus Taxonomy Profile: Geminiviridae. J. Gen. Virol. 2017, 98, 131–133.
  18. Pagán, I.; García-Arenal, F. Population Genomics of Plant Viruses. In Population Genomics: Microorganisms; Polz, M.F., Rajora, O.P., Eds.; Springer International Publishing: New York, NY, USA, 2018.
  19. Ge, L.; Zhang, J.; Zhou, X.; Li, H. Genetic Structure and Population Variability of Tomato Yellow Leaf Curl China Virus. J. Virol. 2007, 81, 5902–5907.
  20. Rodelo-Urrego, M.; García-Arenal, F.; Pagán, I. The effect of ecosystem biodiversity on virus genetic diversity depends on virus species: A study of chiltepin-infecting begomoviruses in Mexico. Virus Evol. 2015, 1, vev004.
  21. Edgar, R.C. MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32, 1792–1797.
  22. Larsson, A. AliView: A fast and lightweight alignment viewer and editor for large data sets. Bioinformatics 2014, 30, 3276–3278.
  23. Kosakovsky Pond, S.L.; Frost, S.D. Datamonkey: Rapid detection of selective pressure on individual sites of codon alignments. Bioinformatics 2005, 21, 2531–2533.
  24. Kosakovsky Pond, S.L.; Frost, S.D.; Grossman, Z.; Gravenor, M.B.; Richman, D.D.; Leigh Brown, A.J. Adaptation to different human populations by HIV-1 revealed by codon-based analyses. PLoS Comp. Biol. 2006, 2, e62.
  25. Martin, D.P.; Varsani, A.; Roumagnac, P.; Botha, G.; Maslamoney, S.; Schwab, T.; Kelz, Z.; Kumar, V.; Murrell, B. RDP5: A computer program for analyzing recombination in, and removing signals of recombination from, nucleotide sequence datasets. Virus Evol. 2020, 7, veaa087.
  26. Lefeuvre, P.; Lett, J.-M.; Reynaud, B.; Martin, D.P. Avoidance of protein fold disruption in natural virus recombinants. PLoS Pathog. 2007, 3, e181.
  27. Belgorodski, N.; Greiner, M.; Tolksdorf, K.; Schueller, K. rriskDistributions: Fitting Distributions to Given Data or Known Quantiles. R Package v.2.1. Available online: https://CRAN.R-project.org/package=rriskDistributions (accessed on 16 December 2021).
  28. Sokal, R.R.; Rohlf, F.J. Biometry: The Principles and Practices of Statistics in Biological Research; W.H. Freeman & Company: New York, NY, USA, 1995.
  29. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2021; Available online: https://www.R-project.org/ (accessed on 16 December 2021).
  30. Luna, A.P.; Lozano-Durán, R. Geminivirus-Encoded Proteins: Not All Positional Homologs Are Made Equal. Front. Microbiol. 2020, 11, 878.
  31. Chirico, N.; Vianelli, A.; Belshaw, R. Why genes overlap in viruses. Proc. Biol. Sci. 2010, 277, 3809–3817.
  32. Rancurel, C.; Khosravi, M.; Dunker, A.K.; Romero, P.R.; Karlin, D. Overlapping Genes Produce Proteins with Unusual Sequence Properties and Offer Insight into De Novo Protein Creation. J. Virol. 2009, 83, 10719–10736.
  33. Sabath, N.; Wagner, A.; Karlin, D. Evolution of Viral Proteins Originated De Novo by Overprinting. Mol. Biol. Evol. 2012, 29, 3767–3780.
  34. Krakauer, D.C. Evolutionary principles of genomic compression. Comments Theor. Biol. 2002, 7, 215–236.
  35. Elena, S.F.; Carrasco, P.; Darós, J.A.; Sanjuán, R. Mechanisms of genetic robustness. EMBO Rep. 2006, 7, 168–173.
  36. Holmes, E.C. Error thresholds and the constraints to RNA virus evolution. Trends Microbiol. 2003, 11, 543–546.
  37. Lillo, F.; Krakauer, D.C. A statistical analysis of the three-fold evolution of genomic compression through frame overlaps in prokaryotes. Biol. Direct 2007, 2, 22.
  38. Krakauer, D.C.; Plotkin, J.B. Redundancy, antiredundancy, and the robustness of genomes. Proc. Natl. Acad. Sci. USA 2002, 99, 1405–1409.
  39. Peleg, O.; Kirzhner, V.; Trifonov, E.; Bolshoy, A. Overlapping messages and survivability. J. Mol. Evol. 2004, 59, 520–527.
  40. Hughes, A.L.; Westover, K.; da Silva, J.; O’Connor, D.H.; Watkins, D.I. Simultaneous positive and purifying selection on overlapping reading frames of the tat and vpr genes of simian immunodeficiency virus. J. Virol. 2001, 75, 7966–7972.
  41. Guyader, S.; Ducray, D.G. Sequence analysis of potato leafroll virus isolates reveals genetic stability, major evolutionary events and differential selection pressure between overlapping reading frame products. J. Gen. Virol. 2002, 83, 1799–1807.
  42. Pagán, I.; Holmes, E.C. Long-term evolution of the Luteoviridae: Time scale and mode of virus speciation. J. Virol. 2010, 84, 6177–6187.
  43. Schneider, W.L.; Roossinck, M.J. Genetic diversity in RNA viral quasispecies is controlled by host-virus interactions. J. Virol. 2001, 75, 6566–6571.
  44. Deem, A.; Keszthelyi, A.; Blackgrove, T.; Vayl, A.; Coffey, B.; Mathur, R.; Chabes, A.; Malkova, A. Break-induced replication is highly inaccurate. PLoS Biol. 2011, 9, e1000594.
  45. Hicks, W.M.; Kim, M.; Haber, J.E. Increased mutagenesis and unique mutation signature associated with mitotic gene conversion. Science 2010, 329, 82–85.
  46. Fondong, V.N. Geminivirus protein structure and function. Mol. Plant. Pathol. 2013, 14, 635–649.
  47. Luna, A.P.; Romero-Rodríguez, B.; Rosas-Díaz, T.; Cerero, L.; Rodríguez-Negrete, E.A.; Castillo, A.G.; Bejarano, E.R. Characterization of Curtovirus V2 protein, a functional homolog of Begomovirus V2. Front. Plant Sci. 2020, 11, 835.
  48. Lacatus, G.; Sunter, G. The Arabidopsis PEAPOD2 transcription factor interacts with geminivirus AL2 protein and the coat protein promoter. Virology 2009, 392, 196–202.
  49. Pagán, I.; Betancourt, M.; de Miguel, J.; Piñero, D.; Fraile, A.; García-Arenal, F. Genomic and biological characterization of chiltepín yellow mosaic virus, a new tymovirus infecting Capsicum annuum var. aviculare in Mexico. Arch. Virol. 2010, 155, 675–684.
  50. Deom, C.M.; Brewer, M.T.; Severns, P.M. Positive selection and intrinsic disorder are associated with multifunctional C4(AC4) proteins and geminivirus diversification. Sci. Rep. 2021, 11, 11150.
  51. Fondong, V.N. The ever-expanding role of C4/AC4 in Geminivirus infection: Punching above its weight? Mol. Plant 2019, 12, 145–147.
  52. Martin, D.P.; Lefeuvre, P.; Varsani, A.; Hoareau, M.; Semegni, J.-Y.; Dijoux, B.; Vincent, C.; Reynaud, B.; Lett, J.-M. Complex recombination patterns arising during geminivirus coinfections preserve and demarcate biologically important intra-genome interaction networks. PLoS Pathog. 2011, 7, e1002203.
  53. Simon-Loriere, E.; Holmes, E.C. Why do RNA viruses recombine? Nat. Rev. Microbiol. 2011, 9, 617–626.
Figure 1. Genome organization of mono- and bipartite begomoviruses. Colored arrows denote the position and orientation of each gene. Monopartite begomoviruses and the DNA-A of bipartite begomoviruses encode: AC1: Replication initiator protein (Rep); AC2: Transcriptional activator protein (TrAP); AC3: Replication enhancer protein (Ren); AC4: Silencing suppressor; AV1: Coat protein (CP); and AV2: Various functions. In bipartite begomoviruses AV2 is only present in Old World (OW) begomoviruses, where AV1 is as long as in monopartite begomoviruses (dashed). DNA-B of bipartite begomoviruses encodes: BC1: Movement protein (MP) and BC2: Nuclear shuttle protein (NSP). CR, common region. The hairpin that includes the origin of replication (ORI) is indicated in the Long Intergenic Region (LIR) (modified from [17]).
Figure 2. Overlapping and nonoverlapping tree lengths (t) in overlapping genes. Blue dots denote genes in which t is significantly higher in NOV than in OV regions. Red dots denote genes showing the opposite trend. Genes with different types of overlapping (internal, 5′-, and 3′-terminal) are presented in different panels.
Figure 3. Overlapping and nonoverlapping dN/dS (upper line), dN (middle line) and dS (lower line) in overlapping genes. Blue dots denote instances in which parameters are significantly higher in NOV than in OV regions. Red dots denote genes showing the opposite trend. Genes with different types of overlapping (internal, 5′-, and 3′-terminal) are presented in different panels.
Figure 4. Correlation between tree length (t/number of sequences) and the proportion of gene overlap (length of overlap/total gene length) for all types of overlap considered together (upper left), internal overlapping (upper right), 5′-terminal overlapping (lower left) and 3′-terminal overlapping (lower right).
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
