Article
Peer-Review Record

Probing the RNA Structure of a Satellite RNA of Cucumber Mosaic Virus Using SHAPE Method

Agronomy 2023, 13(8), 1990; https://doi.org/10.3390/agronomy13081990
by Zhifei Liu 1,†, Xinran Cao 2,3,4,*,†, Chengming Yu 1 and Xuefeng Yuan 1,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 15 June 2023 / Revised: 12 July 2023 / Accepted: 26 July 2023 / Published: 27 July 2023
(This article belongs to the Special Issue Molecular Evolution of Plant RNA Viruses)

Round 1

Reviewer 1 Report

The authors of this manuscript have used the SHAPE method in combination with the mfold secondary structure software to create a model for the secondary structure of a satellite RNA of CMV. The SHAPE method involves modification of the 2’ hydroxyls of the ribose sugar in unpaired nucleotides in an RNA structure. Earlier techniques have involved modifications of the base instead, but all these approaches have the same effect on cDNA synthesis on the modified template: they cause the synthesis to stop. By matching the position of the stops with that of the same primer used on unmodified RNA in a sequencing reaction, the location of the stops can be determined. Prior to this study, several methods including chemical modification procedures were used to determine the secondary structure of at least 10 CMV satellite RNAs (refs. 21, 31, 33, 34 and references therein). In one case, SHAPE was used to determine the sequences of a CMV satellite RNA that affected the accumulation of CMV (ref. 21). Thus, while each analysis shows that there are common areas of structure, they also show differences in other domains. All but one of these studies have been limited solely to in vitro analysis of the satellite RNA structure; the exception (ref. 33) also examined the structure in planta and in isolated virions, with much commonality, but also differences. Therefore, except for studies in which mutagenesis has been done to substantiate structures associated with biological phenotypes (ref. 21), none of the in vitro structures alone can lead to an understanding of what happens in vivo, where the satellite RNAs may be associated with host proteins affecting their structure. Consequently, the expectations of the authors of this study are rather too optimistic, given that, except for the work in ref. 21, knowledge of earlier structures (work done up to 1997) has not led to any understanding of the role of CMV satellite RNA structure in any biological property.

The sole justification then for this study is to determine the structure of a CMV satellite RNA that conforms to the third group of CMV satellite RNAs referred to as “larger CMV satellite RNAs”; i.e., longer than the usual 332-342 nt size range, which formed two groups by sequence phylogeny, one being necrogenic satellite RNAs and the other being non-necrogenic ones, whereas the larger CMV satellite RNAs, ranging from 368-405 nt, formed the third group (Fraile & Garcia-Arenal, 1991. J. Mol. Biol. 221: 1065-1069; Garcia-Arenal & Palukaitis, 1999. Curr. Top. Microbiol. Immunol. 239: 37-63). Thus far, no actual larger CMV satellite RNA has had its structure determined using chemical modifications. Rather, the structures shown were determined with either limited enzymatic cleavage mapped onto previous structures, or just in scripto determined structures (Ref 35 and Hidaka et al., 1988. Virology 164: 326-333). Thus, this is the unique feature of this study.

Here, there are two fundamental problems with the work presented. The first problem relates to the secondary structure itself (Figure 4). The structure must be drawn properly with bonds at proper distances (or close to them) and not stretched to impossible distances, as shown in the figure.

Example 1: ss2, ss3 and (to a lesser extent) ss4 all must be collapsed so that the adjacent stems are closer to each other. Then, the various stems of SL2-SL5 will all be much closer to each other.

Example 2: Since there is no single-stranded region connecting SL5 to SL6, the two Cs adjacent to each other at the bottom of the two stems must not have a space between them, just one unseen phosphodiester group linking the two Cs.

Example 3: BL3-1 is not a bulge loop, since opposite it are no nucleotides, but rather two adjacent stems (C77 and C76). Bringing them together would pull over stem-loop hL3-1 and generate a loop opposite it, but 3 nt loops are not stable, and thus one bp on either side of the loop (G89-U90-U91) would have to open to accommodate a 4 bp loop.

Example 4: The phosphodiester bonds opposite bulges mbL3-2 and mbL3-3 also cannot exist as shown but would have to be collapsed so that nucleotides C100 and G101 are adjacent to each other, pulling over hL3-2, whereas collapsing the space shown between G128 and C129 would pull over hL3-3 and hL3-4, as well as allow nucleotides G202, C203 and U204 to collapse to normal bond distances between the two flanking stems. These necessary changes would produce a more accurate structure.
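The loop-size constraint raised in Example 3 can be checked mechanically once a structure is written in dot-bracket notation. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, the input is a plain dot-bracket string without pseudoknots, and the 4-nt minimum is taken from the reviewer's remark rather than from any folding software.

```python
def hairpin_loops(dot_bracket):
    """Return (start, size) for every hairpin loop in a dot-bracket string.

    A hairpin loop is a run of unpaired positions ('.') enclosed directly
    by an opening '(' on its left and a closing ')' on its right.
    'start' is the 1-based position of the first loop nucleotide.
    """
    loops = []
    n = len(dot_bracket)
    for i, symbol in enumerate(dot_bracket):
        if symbol != "(":
            continue
        j = i + 1
        while j < n and dot_bracket[j] == ".":
            j += 1
        if j < n and dot_bracket[j] == ")":
            loops.append((i + 2, j - i - 1))
    return loops


def short_hairpins(dot_bracket, min_size=4):
    """List hairpin loops smaller than min_size (assumed threshold)."""
    return [(start, size) for start, size in hairpin_loops(dot_bracket)
            if size < min_size]


# Toy structure (not the TA-Tb satellite): one 3-nt loop and one 4-nt loop.
print(short_hairpins("((...))..((....))"))
```

On the toy string, only the 3-nt loop is reported, which is the kind of loop the reviewer asks to be opened by one base pair.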

The second fundamental problem is the phylogenetic tree analysis. For reasons of space, no one wants to take more than 100 CMV satellite RNAs and place them in one tree. Hence, in this study, the authors chose various CMV satellite RNAs at random to place in the tree. The problem is that the tree is labeled very differently from the tree the same laboratory published in Liu et al. 2018. Acta Phytopathologica Sinica 48: 365-372. In addition, some isolates are incorrectly labeled and (at least) one is incorrectly placed (see below). The authors have an opportunity here to do this correctly and to modify what has previously been stated on this topic, rather than ignore it, as they have done in both papers; viz., that the larger CMV satellite RNAs also fall into two groups, one more closely related to the other two groups previously identified and one much further away.

After these corrections have been made to the manuscript, it should make a fine addition to the satellite RNA literature.

Other comments:

1. The authors should also remind the readers that mfold seeks to bring the two ends together and most of the stable structures it generates do so, as was seen in Supplementary Figure 2. In addition, the structures are based on the most stable free energy values under conditions that are not physiological for plants, or animals for that matter.

2. In Figure 6B, the authors should determine what the structures of these other SL1 and SL2 for each of the four other satellites look like if their corresponding sequences are superimposed on each of the five best structures for satellite TA-Tb. They may find a consensus structure with some extra bulges, which may not have the lowest free energy, but could allow various interactions to occur.

3. In the 2018 paper by these authors, they grouped the satellite RNAs into three subgroups (I-III) and a fourth one with no subgroup designation. In this manuscript, they have taken the original subgroups I and II and renamed them Group I-I. The original three-member Subgroup III has been renamed Group I-II (although only one of the three members shown is the same as one in Subgroup III in the 2018 paper), and the last subgroup from the 2018 study is now named Group II. This seems completely arbitrary with no explanation. In fact, it appears that when considering the population of the new Group I-II and the old Subgroup III, they are a mix of isolates that are 368-nt isolates, with D28559 being a 405-nt isolate. By contrast, the various isolates grouped in this study into Group II are those containing 383-390 nt. Since the KN-satellite RNA isolate was shown to contain two insertion regions relative to the D-satellite RNA (or CARNA5) (Ref. 35), designated IS-1 and IS-2, with Y-satellite RNA (368 nt) having only IS-2 inserted between nucleotides 146 and 147, and 57-satellite RNA (also 368 nt) having only IS-1 inserted between nucleotides 80-87, whereas KN-satellite RNA had both inserts, this raises some interesting questions that have not been addressed since those earlier data were published. With the analyses done here, these questions can be addressed. First, do those 368-nt satellite RNAs fall into two groups (Groups I-II and II) depending on which insertion (IS-1 or IS-2) is present, with one type also containing the largest (405-nt) KN-satellite RNA and the other containing the 383-390-nt satellite RNAs? Second, are there other features (insertions) that could affect the phylogeny and the secondary RNA structure that differentiate Groups I-II and II? The authors have the opportunity here to answer these questions. They can also address the question of whether the original biological designation of Groups I and II (332-342-nt satellite RNAs based on necrogenic vs. non-necrogenic satellite RNAs) still holds true, if such biology has been recorded (not always available, if the only publication is the sequence in GenBank).

4. In this study, one satellite RNA is located in a completely different neighborhood. The isolate “banana (U43889)” was located in the lower half of Group I-I (or what was called Subgroup II in the 2018 study). However, in the 2018 study, “U43889” (with no other information) was located in Subgroup I, close to “MF142364 TA-ra China”, whereas in this study, “TA-Ra” was located in the upper half of Group I-I, nowhere near “U43889”. On the other hand, the nearest neighbor of “TA-Ra” in this study is “XJs2 DQ070747”, whereas in the 2018 study, “DQ070747 China” was located in Subgroup II, between “D00541 England” and “D00699 Japan”, which is also where “banana (U43889)” is located in this study! So, which of these isolates are in the wrong place?

5. In the current study, the GenBank number given for “TA-Ra” is “KY646100”, which is not the same as the number given in the 2018 paper. A search shows that MF142364 is the correct number for TA-ra satellite RNA, whereas KY646100 is the GenBank number for the TA-ca RNA 3 partial MP gene! It is probably best to confirm all GenBank numbers listed with their actual sequences.
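The first question in comment 3 amounts to tabulating which insertion each larger satellite RNA carries and checking whether that pattern tracks the tree groups. The sketch below shows only the tabulation step. It uses only the three isolates whose length and insertion content are stated in comment 3; the dictionary layout and function name are hypothetical, and assigning the remaining isolates would require inspecting the alignment itself.

```python
from collections import defaultdict

# Lengths and insertion content as stated in comment 3
# (insertions are relative to the D-satellite RNA).
SATELLITES = {
    "Y": {"length_nt": 368, "insertions": frozenset({"IS-2"})},
    "57": {"length_nt": 368, "insertions": frozenset({"IS-1"})},
    "KN": {"length_nt": 405, "insertions": frozenset({"IS-1", "IS-2"})},
}


def group_by_insertions(satellites):
    """Group isolate names by the set of insertion regions they carry."""
    groups = defaultdict(list)
    for name, info in satellites.items():
        key = "+".join(sorted(info["insertions"])) or "none"
        groups[key].append(name)
    return {key: sorted(names) for key, names in groups.items()}


print(group_by_insertions(SATELLITES))
```

Comparing these insertion classes with the group membership in the revised tree would show whether the two 368-nt types separate as the reviewer suspects.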

Author Response

Responses to Reviewer 1

 

The authors of this manuscript have used the SHAPE method in combination with the mfold secondary structure software to create a model for the secondary structure of a satellite RNA of CMV. The SHAPE method involves modification of the 2’ hydroxyls of the ribose sugar in unpaired nucleotides in an RNA structure. Earlier techniques have involved modifications of the base instead, but all these approaches have the same effect on cDNA synthesis on the modified template: they cause the synthesis to stop. By matching the position of the stops with that of the same primer used on unmodified RNA in a sequencing reaction, the location of the stops can be determined. Prior to this study, several methods including chemical modification procedures were used to determine the secondary structure of at least 10 CMV satellite RNAs (refs. 21, 31, 33, 34 and references therein). In one case, SHAPE was used to determine the sequences of a CMV satellite RNA that affected the accumulation of CMV (ref. 21). Thus, while each analysis shows that there are common areas of structure, they also show differences in other domains. All but one of these studies have been limited solely to in vitro analysis of the satellite RNA structure; the exception (ref. 33) also examined the structure in planta and in isolated virions, with much commonality, but also differences. Therefore, except for studies in which mutagenesis has been done to substantiate structures associated with biological phenotypes (ref. 21), none of the in vitro structures alone can lead to an understanding of what happens in vivo, where the satellite RNAs may be associated with host proteins affecting their structure. Consequently, the expectations of the authors of this study are rather too optimistic, given that, except for the work in ref. 21, knowledge of earlier structures (work done up to 1997) has not led to any understanding of the role of CMV satellite RNA structure in any biological property.

 

The sole justification then for this study is to determine the structure of a CMV satellite RNA that conforms to the third group of CMV satellite RNAs referred to as “larger CMV satellite RNAs”; i.e., longer than the usual 332-342 nt size range, which formed two groups by sequence phylogeny, one being necrogenic satellite RNAs and the other being non-necrogenic ones, whereas the larger CMV satellite RNAs, ranging from 368-405 nt, formed the third group (Fraile & Garcia-Arenal, 1991. J. Mol. Biol. 221: 1065-1069; Garcia-Arenal & Palukaitis, 1999. Curr. Top. Microbiol. Immunol. 239: 37-63). Thus far, no actual larger CMV satellite RNA has had its structure determined using chemical modifications. Rather, the structures shown were determined with either limited enzymatic cleavage mapped onto previous structures, or just in scripto determined structures (Ref 35 and Hidaka et al., 1988. Virology 164: 326-333). Thus, this is the unique feature of this study.

 

Answer: Thank you very much for your positive comments and valuable suggestions on our work. We are very grateful that our manuscript has been reviewed by a reviewer with extensive knowledge and expertise in this particular research subject, so that we can improve the quality of our article. In relation to the results presented in this manuscript, our research group is currently investigating the roles of RNA structures in the biological properties of satCMV TA-Tb. Indeed, we have made significant progress toward this objective and plan to publish our results in the near future.

Here, there are two fundamental problems with the work presented. The first problem relates to the secondary structure itself (Figure 4). The structure must be drawn properly with bonds at proper distances (or close to them) and not stretched to impossible distances, as shown in the figure.

 

Example 1: ss2, ss3 and (to a lesser extent) ss4 all must be collapsed so that the adjacent stems are closer to each other. Then, the various stems of SL2-SL5 will all be much closer to each other.

 

Example 2: Since there is no single-stranded region connecting SL5 to SL6, the two Cs adjacent to each other at the bottom of the two stems must not have a space between them, just one unseen phosphodiester group linking the two Cs.

 

Example 3: BL3-1 is not a bulge loop, since opposite it are no nucleotides, but rather two adjacent stems (C77 and C76). Bringing them together would pull over stem-loop hL3-1 and generate a loop opposite it, but 3 nt loops are not stable, and thus one bp on either side of the loop (G89-U90-U91) would have to open to accommodate a 4 bp loop.

 

Example 4: The phosphodiester bonds opposite bulges mbL3-2 and mbL3-3 also cannot exist as shown but would have to be collapsed so that nucleotides C100 and G101 are adjacent to each other, pulling over hL3-2, whereas collapsing the space shown between G128 and C129 would pull over hL3-3 and hL3-4, as well as allow nucleotides G202, C203 and U204 to collapse to normal bond distances between the two flanking stems. These necessary changes would produce a more accurate structure.

 

Answer: We have substantially modified the RNA structure presented in Figure 4 according to your suggestions. In particular, nucleotide bonds in single-stranded regions are now drawn at proper distances and are not overly stretched.

 

The second fundamental problem is the phylogenetic tree analysis. For reasons of space, no one wants to take more than 100 CMV satellite RNAs and place them in one tree. Hence, in this study, the authors chose various CMV satellite RNAs at random to place in the tree. The problem is that the tree is labeled very differently from the tree the same laboratory published in Liu et al. 2018. Acta Phytopathologica Sinica 48: 365-372. In addition, some isolates are incorrectly labeled and (at least) one is incorrectly placed (see below). The authors have an opportunity here to do this correctly and to modify what has previously been stated on this topic, rather than ignore it, as they have done in both papers; viz., that the larger CMV satellite RNAs also fall into two groups, one more closely related to the other two groups previously identified and one much further away.

 

Answer: As also suggested by Reviewer 2, we have reanalyzed the sequence data and generated the phylogenetic tree using the maximum-likelihood method. We admit, with embarrassment, that sequences were incorrectly labelled in the phylogenetic tree presented in our previous paper (Liu et al., 2018).

After these corrections have been made to the manuscript, it should make a fine addition to the satellite RNA literature.

 

Other comments:

 

1. The authors should also remind the readers that mfold seeks to bring the two ends together and most of the stable structures it generates do so, as was seen in Supplementary Figure 2. In addition, the structures are based on the most stable free energy values under conditions that are not physiological for plants, or animals for that matter.

 

Answer: This information has been added to the manuscript (Lines 264-267).

 

2. In Figure 6B, the authors should determine what the structures of these other SL1 and SL2 for each of the four other satellites look like if their corresponding sequences are superimposed on each of the five best structures for satellite TA-Tb. They may find a consensus structure with some extra bulges, which may not have the lowest free energy, but could allow various interactions to occur.

 

Answer: In Figure 6B, the formation of SL1 and SL2 in the four other satellites was predicted using partial sequences corresponding to the SL1 and SL2 sequences in satellite TA-Tb. Prediction of RNA structures using the complete sequences of these four isolates was not done because no supporting RNA structure data from SHAPE or other analyses are available at this point. Thus, the prediction of SL1 and SL2 structures in other satellites closely related to TA-Tb only implies the presence of such structures based on the sequence conservation in the 5’-proximal region. Regarding your suggestion to superimpose SL1 and SL2 on the five best structures for satellite TA-Tb, we consider that this kind of analysis is quite complex and beyond our competency and the scope of this study.

 

3. In the 2018 paper by these authors, they grouped the satellite RNAs into three subgroups (I-III) and a fourth one with no subgroup designation. In this manuscript, they have taken the original subgroups I and II and renamed them Group I-I. The original three-member Subgroup III has been renamed Group I-II (although only one of the three members shown is the same as one in Subgroup III in the 2018 paper), and the last subgroup from the 2018 study is now named Group II. This seems completely arbitrary with no explanation. In fact, it appears that when considering the population of the new Group I-II and the old Subgroup III, they are a mix of isolates that are 368-nt isolates, with D28559 being a 405-nt isolate. By contrast, the various isolates grouped in this study into Group II are those containing 383-390 nt. Since the KN-satellite RNA isolate was shown to contain two insertion regions relative to the D-satellite RNA (or CARNA5) (Ref. 35), designated IS-1 and IS-2, with Y-satellite RNA (368 nt) having only IS-2 inserted between nucleotides 146 and 147, and 57-satellite RNA (also 368 nt) having only IS-1 inserted between nucleotides 80-87, whereas KN-satellite RNA had both inserts, this raises some interesting questions that have not been addressed since those earlier data were published. With the analyses done here, these questions can be addressed. First, do those 368-nt satellite RNAs fall into two groups (Groups I-II and II) depending on which insertion (IS-1 or IS-2) is present, with one type also containing the largest (405-nt) KN-satellite RNA and the other containing the 383-390-nt satellite RNAs? Second, are there other features (insertions) that could affect the phylogeny and the secondary RNA structure that differentiate Groups I-II and II? The authors have the opportunity here to answer these questions. They can also address the question of whether the original biological designation of Groups I and II (332-342-nt satellite RNAs based on necrogenic vs. non-necrogenic satellite RNAs) still holds true, if such biology has been recorded (not always available, if the only publication is the sequence in GenBank).

 

Answer: Thank you very much for raising these important questions. Our revised phylogenetic tree presented in Figure 4 and the corresponding descriptions in the manuscript (Lines 313-324) now address these questions.

 

4. In this study, one satellite RNA is located in a completely different neighborhood. The isolate “banana (U43889)” was located in the lower half of Group I-I (or what was called Subgroup II in the 2018 study). However, in the 2018 study, “U43889” (with no other information) was located in Subgroup I, close to “MF142364 TA-ra China”, whereas in this study, “TA-Ra” was located in the upper half of Group I-I, nowhere near “U43889”. On the other hand, the nearest neighbor of “TA-Ra” in this study is “XJs2 DQ070747”, whereas in the 2018 study, “DQ070747 China” was located in Subgroup II, between “D00541 England” and “D00699 Japan”, which is also where “banana (U43889)” is located in this study! So, which of these isolates are in the wrong place?

 

Answer: We have confirmed that there was incorrect labelling of satCMV sequences in the phylogenetic tree in our 2018 paper. In the revised phylogenetic tree presented in Figure 4, “banana U43889” and “XJs2 DQ070747” are placed in the correct location.

 

5. In the current study, the GenBank number given for “TA-Ra” is “KY646100”, which is not the same as the number given in the 2018 paper. A search shows that MF142364 is the correct number for TA-ra satellite RNA, whereas KY646100 is the GenBank number for the TA-ca RNA 3 partial MP gene! It is probably best to confirm all GenBank numbers listed with their actual sequences.

 

Answer: This was our mistake. We have corrected the accession numbers.

Reviewer 2 Report

The current study is critical to understand how to optimize protocols to deduce viral RNA structures correctly. Much scope remains to improve the significance of this study. The viral RNA structures in the field of agriculture are still in a nascent stage and thus, require much thorough and robust discussion to understand the impact of this study.

Major critical comment:

- I would kindly request that the authors construct maximum likelihood trees, which are much more accurate and reliable for drawing conclusions than the neighbor-joining tree and also allow a correct estimate of the evolutionary distance. In the phylogenetic trees, bootstrap values of more than 70% should be indicated to show the confidence of the nodes. It seems that the tree was mid-point rooted; using an outgroup would improve the confidence of the tree. The corresponding region of the helper virus can be used as an outgroup here. In the Materials and Methods, it was mentioned that the sequences were aligned. Were the sequences also trimmed, which is critical for constructing the phylogenetic tree?

- Abstract lines 27 and 28; Introduction lines 32 to 43; Discussion lines 386 to 389: As the authors pointed out, RNA structure regulates pathogenicity and replication; more studies should be highlighted in the discussion to show that RNA structures deduced from SHAPE are most of the time biologically relevant, particularly when studying non-coding or untranslated RNA. When SHAPE is combined with nucleotide conservation in multiple sequences and a phylogenetic tree, the accuracy of biological relevance increases. As this is a preliminary study, I would request that the authors highlight this, which underlies the importance of this study, and cite relevant papers: https://doi.org/10.3390/v15030638; https://doi.org/10.1093/nar/gkv917
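
The request to show only bootstrap values above 70% can be applied directly to a tree file before plotting. The following is a minimal sketch, assuming the supports are stored as internal node labels in percent in a Newick string (as many tree-building programs write them); the function name and the example tree are hypothetical and are not the authors' data.

```python
import re

# An internal node label is a number that directly follows a closing parenthesis.
_SUPPORT = re.compile(r"\)(\d+(?:\.\d+)?)")


def hide_low_support(newick, threshold=70.0):
    """Remove internal node support labels below the threshold."""
    def keep_or_drop(match):
        value = float(match.group(1))
        return match.group(0) if value >= threshold else ")"
    return _SUPPORT.sub(keep_or_drop, newick)


# Toy tree with one well-supported (95) and one weakly supported (42) node.
tree = "((A:0.1,B:0.2)95:0.05,(C:0.1,D:0.3)42:0.05);"
print(hide_low_support(tree))
```

Branch lengths are left untouched because they follow a colon rather than a closing parenthesis.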

Minor comments

- line 324: the scale bar should indicate here "nucleotide substitutions per site" instead of amino acid distances. Kindly check this.

- line 127: modified RNA should be replaced with "both treated RNA (modified and control RNA)"

- line 176: were calculated

 

The English language is well-written and easy to understand.

Author Response

Responses to Reviewer 2

 

The current study is critical to understand how to optimize protocols to deduce viral RNA structures correctly. Much scope remains to improve the significance of this study. The viral RNA structures in the field of agriculture are still in a nascent stage and thus, require much thorough and robust discussion to understand the impact of this study.

 

Answer: Thank you very much for your valuable suggestions to improve our manuscript.

 

Major critical comment:

 

- I would kindly request that the authors construct maximum likelihood trees, which are much more accurate and reliable for drawing conclusions than the neighbor-joining tree and also allow a correct estimate of the evolutionary distance. In the phylogenetic trees, bootstrap values of more than 70% should be indicated to show the confidence of the nodes. It seems that the tree was mid-point rooted; using an outgroup would improve the confidence of the tree. The corresponding region of the helper virus can be used as an outgroup here. In the Materials and Methods, it was mentioned that the sequences were aligned. Were the sequences also trimmed, which is critical for constructing the phylogenetic tree?

 

Answer: We have reconstructed the phylogenetic tree using the maximum likelihood method. We have added a partial sequence of CMV RNA3 as an outgroup. The software used for trimming poorly reliable regions in the alignments is described in the Methods section (Lines 176-181).

 

- Abstract lines 27 and 28; Introduction lines 32 to 43; Discussion lines 386 to 389: As the authors pointed out, RNA structure regulates pathogenicity and replication; more studies should be highlighted in the discussion to show that RNA structures deduced from SHAPE are most of the time biologically relevant, particularly when studying non-coding or untranslated RNA. When SHAPE is combined with nucleotide conservation in multiple sequences and a phylogenetic tree, the accuracy of biological relevance increases. As this is a preliminary study, I would request that the authors highlight this, which underlies the importance of this study, and cite relevant papers: https://doi.org/10.3390/v15030638; https://doi.org/10.1093/nar/gkv917

 

Answer: We have added a discussion of this matter and the corresponding references (Lines 403-415).

 

Minor comments

 

- line 324: the scale bar should indicate here "nucleotide substitutions per site" instead of amino acid distances. Kindly check this.

 

Answer: The text has been changed (Lines 340-341).

 

- line 127: modified RNA should be replaced with "both treated RNA (modified and control RNA)"

 

Answer: The text has been changed (Lines 129-130).

 

- line 176: were calculated

 

Answer: The text has been changed (Line 176).

Round 2

Reviewer 1 Report

I am satisfied with the explanations and changes made to the manuscript.

Reviewer 2 Report

The authors have improved the discussions and re-did some analyses mentioned earlier. Currently, the manuscript is ready for publication without any further reservations.
