Towards the Identification of Candidate Genes for Pollen Morphological Traits in Rubus L. Using Association Mapping
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Dear authors,
This manuscript presents the identification of molecular markers associated with important phenotypical traits of Rubus L. In its current form it needs significant revision because it does not sufficiently disclose to readers the information about the markers you have obtained, and there is no specific literary interpretation of them. Please see my comments below.
I find the title of the manuscript too long. Please think about how it can be reformulated. For example: "Towards the Identification of Candidate Genes for Pollen Morphological Traits in Rubus L. Using Association Mapping". In this case, you can replace the non-specific word "estimation" in the keywords with "Rosaceae".
The abstract and introduction are written on point. However, towards the end of the introduction it would be worth adding whether there is any information on molecular markers of raspberry pollen traits in the literature, to add context. Also, in the introduction, it is necessary to reflect information about why exactly these four features were chosen by you for study (length of the polar axis, length of the ectoaperture, distance between the apices of two ectocolpi, and equatorial diameter).
Materials and methods
2.1. How many plants were taken for each species? And how old were they?
2.2. "All theoretically possible marker observations were treated as genotype observations" - There is some kind of repetition here: "marker observations" - "genotype observations", please rephrase this sentence.
2.3. Line 122: "Fisher’s least significant differences" → Fisher’s least significant differences test;
In the current form of presentation, the results do not allow for a proper evaluation of the manuscript.
In particular, lines 185-191 provide very obvious conclusions about the correlations of the variables under study. Is this something that needs to be discussed? Was it previously unknown?
Table 2 needs to be moved to supplements.
Lines 230-231: The materials and methods do not indicate how exactly the association mapping procedure was carried out, so this proposal cannot be understood properly.
Question to Figure 4: Are the coordinates of the genomic markers shown on the X-axis? Did you make adjustments for multiple comparisons? With such a number of statistical tests, it is necessary to select only those markers that passed p.adjusted < 0.05 or even more strictly.
A significant problem with the last paragraph of section 3.2. (lines 258-272) is that you discuss the most important markers in the trait intersection, but the reader only sees their binary designations and ordinal numbers. It is necessary to make information about markers (their genomic coordinates, flanking sequences, etc.) available to readers in the form of a table in the supplements, otherwise this information will remain useless to them. Also, the reader knows nothing about their nature - neither the genomic position, nor the sequences, nor the type of marker. This needs to be clarified here. Or, this information can be presented in table number 4 and a detailed informative supplement can be prepared from it. In any case, I would advise you to remove Table 4 from the supplement and supplement it with this important information.
The discussion needs significant improvement: in its current version it largely repeats the introduction's thoughts on the importance of studying genetic markers, but it needs to focus on interpreting your results in the context of the literature and discussing specific markers in detail. In the current version only a few sentences are devoted to them (354-359).
The conclusion is also not sufficiently specific and general ideas from its beginning can be shortened.
Author Response
Response to Reviewer 1 Comments
Reviewer #1
Point 1: This manuscript presents the identification of molecular markers associated with important phenotypical traits of Rubus L. In its current form it needs significant revision because it does not sufficiently disclose to readers the information about the markers you have obtained, and there is no specific literary interpretation of them. Please see my comments below.
Response: We sincerely thank all of you for your comments. We appreciate their insight. We have revised the manuscript and have provided responses to each comment below.
Point 2: I find the title of the manuscript too long. Please think about how it can be reformulated. For example: "Towards the Identification of Candidate Genes for Pollen Morphological Traits in Rubus L. Using Association Mapping". In this case, you can replace the non-specific word "estimation" in the keywords with "Rosaceae".
Response: We sincerely appreciate your feedback regarding the manuscript's title. We wanted to include all aspects relevant to our approach to analysis based on all possible markers. We agree that the title may be too long. Therefore, we appreciate your suggestion to change the title, which we agree with. We have changed the title to the suggested one: “Towards the Identification of Candidate Genes for Pollen Morphological Traits in Rubus L. Using Association Mapping”. We also changed the suggested keyword from "estimation" to "Rosaceae".
Point 3: The abstract and introduction are written on point. However, towards the end of the introduction it would be worth adding whether there is any information on molecular markers of raspberry pollen traits in the literature, to add context. Also, in the introduction, it is necessary to reflect information about why exactly these four features were chosen by you for study (length of the polar axis, length of the ectoaperture, distance between the apices of two ectocolpi, and equatorial diameter).
Response: We have added information in the Introduction about research by other authors on the use of molecular markers for raspberry pollen traits: "There is information in the literature on molecular markers of raspberry pollen traits [30-32], but this is limited to specific marker observations. In this study, we present the possibility of analysis conducted on all theoretically possible markers." These four specific traits were chosen for study because, according to the literature, they best characterize Rubus pollen. We have added this information in the Introduction of the manuscript.
Materials and methods
Point 4: 2.1. How many plants were taken for each species? And how old were they?
Response: The number of observations per species varied, ranging from 2 (for R. constrictus and R. opacus) to 5 (for R. bifrons, R. caesius, R. gracilis, R. henrici-egonis, R. idaeus, R. nessensis, R. plicatus, R. radula, R. saxatilis, R. sprengelii, and R. sulcatus) (Table 1). In each case, measurements were made from 30 mature, randomly selected, properly developed pollen grains using light microscopy (LM), measuring a total of 2100 pollen grains. We have added this information in the revised version of the manuscript.
Point 5: 2.2. "All theoretically possible marker observations were treated as genotype observations" - There is some kind of repetition here: "marker observations" - "genotype observations", please rephrase this sentence.
Response: Thank you for pointing out the repetition in the sentence. We have corrected it. The sentence now reads: "All theoretically possible marker were treated as genotype observations."
Point 6: 2.3. Line 122: "Fisher’s least significant differences" → Fisher’s least significant differences test;
Response: We have corrected “Fisher’s least significant differences” to “Fisher’s least significant differences (LSDs) test”.
Point 7: In the current form of presentation, the results do not allow for a proper evaluation of the manuscript.
Response: The results were presented according to a standard general-to-specific approach. First, we presented the results for phenotypic traits: the empirical distribution's conformity to the normal distribution, MANOVA, ANOVA, homogeneous groups, and multitrait clustering. Next, we presented genetic comparisons based on molecular marker observations. Finally, we presented the results of association mapping, which, on the one hand, specifically combined phenotypic and genotypic observations, and, on the other, identified markers linked to candidate genes determining individual observed traits. We believe this provides a clear presentation of the results. We have moved one table to the supplementary materials. We have also added information on population size at the beginning of the Results section.
Point 8: In particular, lines 185-191 provide very obvious conclusions about the correlations of the variables under study. Is this something that needs to be discussed? Was it previously unknown?
Response: We agree that the correlations among the studied traits are known and, in most cases, reflect obvious interdependencies; at least some authors report similar results. In this manuscript, we decided to discuss correlations because some authors suggest a link between the correlation of traits and the determination of these traits by specific genes. According to this theory, significantly correlated traits are determined by the same genes. If this is confirmed, it may be sufficient in future studies to conduct association mapping on only one trait from each group of correlated traits. This would shorten and accelerate the research and would also be more economical.
Point 9: Table 2 needs to be moved to supplements.
Response: We have moved Table 2 to the supplementary materials. It is now Table S1.
Point 10: Lines 230-231: The materials and methods do not indicate how exactly the association mapping procedure was carried out, so this proposal cannot be understood properly.
Response: Association mapping was conducted based on species mean trait values and the marker data, using a mixed linear model (MLM) approach. This model incorporated population structure inferred via eigenanalysis and modeled as random effects. All statistical analyses were carried out using the QSASSOCIATION procedure. This procedure implements a mixed-model marker–trait association analysis (also known as linkage disequilibrium mapping) for data obtained from a single-trait trial. To control for false-positive associations due to population structure and relatedness, the model included a genetic relatedness correction. The RELATIONSHIPMODEL=eigenanalysis option was used, which identifies major principal components from the marker matrix to account for population stratification. The scores of the significant principal components were incorporated as covariates in the MLM, providing an approximation of the genetic variance–covariance structure via a kinship matrix. The statistical significance of marker–trait associations was evaluated using p-values adjusted for multiple testing using the Benjamini–Hochberg false discovery rate correction method. It seems to us that this has been described in quite some detail.
Point 11: Question to Figure 4: Are the coordinates of the genomic markers shown on the X-axis? Did you make adjustments for multiple comparisons? With such a number of statistical tests, it is necessary to select only those markers that passed p.adjusted < 0.05 or even more strictly.
Response: The X-axis shows individual markers from 0000000000000001 (marker m[1]) to 1111111111111110 (marker m[65534]). Due to the very large number of markers, including the X-axis category labels would only obscure the image without providing any significant information. The most significant results are presented in Table 4. If the Reviewer and Editor so desire, we can provide results for all 65534 markers in the supplementary materials. For multiple comparisons, the Benjamini–Hochberg false discovery rate correction method was used. This information is included in the description of the methods used. Additionally, we have supplemented the caption under Figure 4 with the information: "The statistical significance of marker–trait associations was evaluated using p-values adjusted for multiple testing using the Benjamini–Hochberg false discovery rate correction method."
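For readers unfamiliar with the adjustment named in this response, the following is a minimal sketch of the Benjamini–Hochberg step-up adjustment in plain Python. It is a generic illustration only, not the QSASSOCIATION implementation; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # enforcing that adjusted values never increase as rank decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four markers (not taken from the study).
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
# Markers whose adjusted p-value passes the 0.05 threshold the reviewer mentions.
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
```

With 65534 tests, the same function applies unchanged; only the length of the input list differs.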
Point 12: A significant problem with the last paragraph of section 3.2. (lines 258-272) is that you discuss the most important markers in the trait intersection, but the reader only sees their binary designations and ordinal numbers. It is necessary to make information about markers (their genomic coordinates, flanking sequences, etc.) available to readers in the form of a table in the supplements, otherwise this information will remain useless to them. Also, the reader knows nothing about their nature - neither the genomic position, nor the sequences, nor the type of marker. This needs to be clarified here. Or, this information can be presented in table number 4 and a detailed informative supplement can be prepared from it. In any case, I would advise you to remove Table 4 from the supplement and supplement it with this important information.
Response: The essence of the research described in the manuscript is to consider all possible marker observations, that is, all different, unique sets of zeros and ones. Such a complete set is (unfortunately) unlikely to be obtained in real experiments, and this can result in the loss of valuable information. How valuable? This question cannot be answered due to... the inability to obtain specific sets of zeros and ones. In our research, we proposed a solution to this problem by conducting the analysis on all possible marker observations. This was only possible by generating all combinations of zeros and ones. This is what we did. Therefore, in such a case, we cannot speak of information regarding genomic coordinates or flanking sequences. However, they are not needed because all the information is contained in the set of zeros and ones.
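The generation step described in this response can be sketched as follows. This is a minimal illustration in plain Python, assuming sixteen positions per marker (as in the sixteen-character notation used by the authors) and assuming that the all-zeros and all-ones patterns are excluded, which is consistent with the stated range m[1] to m[65534]. The function name `all_marker_patterns` is hypothetical and is not the authors' code.

```python
def all_marker_patterns(n_positions=16):
    """Yield (index, pattern) for every 0/1 pattern that is not constant.

    Index 0 (all zeros) and index 2**n_positions - 1 (all ones) are skipped,
    because a pattern with a single value cannot separate any group.
    """
    for index in range(1, 2 ** n_positions - 1):
        # Zero-padded binary string of fixed width, e.g. 1 -> "0000000000000001".
        yield index, format(index, f"0{n_positions}b")


patterns = list(all_marker_patterns())
n_markers = len(patterns)        # 2**16 - 2
first_marker = patterns[0]
last_marker = patterns[-1]
```

Because the patterns are generated rather than observed, each one is identified only by its index and its string of zeros and ones.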
Point 13: The discussion needs significant improvement: in its current version it largely repeats the introduction's thoughts on the importance of studying genetic markers, but it needs to focus on interpreting your results in the context of the literature and discussing specific markers in detail. In the current version only a few sentences are devoted to them (354-359).
Response: Writing the Discussion was no easy task. This is due to the fact that the presented results are likely the first application of association mapping to Rubus pollen morphological traits in the literature. Therefore, the main emphasis in the Discussion is on the use/application of molecular markers in Rubus research. Due to our innovative approach, we were unable to directly compare our results with those of other researchers, as such studies do not exist. Of course, the importance of using association mapping for other species can be emphasized. We did not do this in our work to avoid introducing unnecessary information. If the Reviewer and/or Editor so desires, we will supplement the Discussion with a paragraph discussing the results of association mapping obtained by other researchers for other species.
Point 14: The conclusion is also not sufficiently specific and general ideas from its beginning can be shortened.
Response: We've improved the Conclusions by adding sentences with important information. We've also shortened the beginning of this chapter by removing two sentences.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
I have attached the manuscript with specific comments.
Comments for author File: Comments.pdf
Author Response
Response to Reviewer 2 Comments
Reviewer #2
Point 1: I have attached the manuscript with specific comments.
Response: We sincerely thank you for drawing our attention to the highlighted sections of the manuscript. We have included our responses below, matching them to the appropriate places.
Point 2: Title.
Response: We changed the title to: “Towards the identification of candidate genes for pollen morphological traits in Rubus L. using association mapping.”
Point 3: Abstract.
Response: We have improved the Abstract by adding information on the uniqueness of the research conducted.
Point 4: Introduction.
Response: We have corrected the indicated sentence to: “Phenomena such as apomixis (asexual reproduction without fertilization), polyploidy (the presence of multiple sets of chromosomes), and frequent hybridization make traditional taxonomic methods based mainly on observations of external characters insufficient and may lead to erroneous conclusions [3].”
Point 5: 2.1. Plant Material
Response: Data on the observation of four pollen traits were taken from the publication by Lechowicz et al. (2022). These four specific traits were chosen for study because, according to the literature, they best characterize Rubus pollen. We have added this information in the Introduction of the manuscript.
Point 6: 2.2. Genotypic Observations
Response: Genotype data were generated to obtain all possible different sets of 0s and 1s. This resulted in obtaining all theoretically observable markers, a situation that is (unfortunately) difficult to achieve in practice. This enabled analysis of the full set of genotype data. Due to the lengthy notation of the sixteen-character sets of 0s and 1s, we decided to use their decimal notation. Therefore, a method for converting these notations from the decimal system to binary and from binary to decimal is provided.
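The conversion between the two notations mentioned in this response can be illustrated with a short sketch in plain Python. The helper names `marker_to_binary` and `binary_to_marker` are hypothetical; the width of 16 follows the sixteen-character sets described above, and the example values are markers quoted elsewhere in this record.

```python
def marker_to_binary(index, width=16):
    """Decimal marker index -> fixed-width string of 0s and 1s."""
    return format(index, f"0{width}b")


def binary_to_marker(pattern):
    """String of 0s and 1s -> decimal marker index."""
    return int(pattern, 2)


pattern_5248 = marker_to_binary(5248)                # "0001010010000000"
index_back = binary_to_marker("1110101101111111")    # 60287
```

The two helpers are inverses of each other for any index between 0 and 65535, so either notation can be recovered from the other without loss.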
Point 7: 2.3. Statistical Methods
Response: Multivariate analysis of variance (MANOVA) and single-factor analyses of variance (ANOVA) were conducted to verify the general hypotheses regarding the absence of differences between species' mean values in terms of the observed traits. Association mapping was performed only when statistically significant differences were observed. Therefore, the statistical analysis began with MANOVA and ANOVA.
Point 8: Association mapping
Response: Association mapping was conducted based on species mean trait values and the marker data, using a mixed linear model (MLM) approach. This model incorporated population structure inferred via eigenanalysis and modeled as random effects. All statistical analyses were carried out using the QSASSOCIATION procedure. This procedure implements a mixed-model marker–trait association analysis (also known as linkage disequilibrium mapping) for data obtained from a single-trait trial. To control for false-positive associations due to population structure and relatedness, the model included a genetic relatedness correction. The RELATIONSHIPMODEL=eigenanalysis option was used, which identifies major principal components from the marker matrix to account for population stratification. The scores of the significant principal components were incorporated as covariates in the MLM, providing an approximation of the genetic variance–covariance structure via a kinship matrix. The statistical significance of marker–trait associations was evaluated using p-values adjusted for multiple testing using the Benjamini–Hochberg false discovery rate correction method. It seems to us that this has been described in quite some detail.
Point 9: 3. Results
Response: Thank you for pointing out the fragment. We have corrected it.
Point 10: Figure 1.
Response: The results presented in Figure 1 provide graphical confirmation of the normality of the empirical distributions of the four observed traits. Normality of distributions is one of the conditions for conducting analysis of variance and association mapping.
Point 11: Line 186: Pearson correlation coefficient of 0.99
Response: We agree that the correlations among the studied traits are known and, in most cases, reflect obvious interdependencies; at least some authors report similar results. In this manuscript, we decided to discuss correlations because some authors suggest a link between the correlation of traits and the determination of these traits by specific genes. According to this theory, significantly correlated traits are determined by the same genes. If this is confirmed, it may be sufficient in future studies to conduct association mapping on only one trait from each group of correlated traits. This would shorten and accelerate the research and would also be more economical.
Point 12: Lines 219-220: two canonical variates revealed a clear separation into four distinct groups.
Response: We recognize that divisions into other groups can be observed. Three larger groups or five smaller ones can be distinguished. However, considering over 53% of the variance explained by the first canonical variate, we distinguished four groups.
Point 13: Line 230: 8320 markers
Response: The statistical significance of marker–trait associations was evaluated using p-values adjusted for multiple testing using the Benjamini–Hochberg false discovery rate correction method. It seems to us that this has been described in quite some detail.
Point 14: 4. Discussion
Response: Writing the Discussion was no easy task. This is due to the fact that the presented results are likely the first application of association mapping to Rubus pollen morphological traits in the literature. Therefore, the main emphasis in the Discussion is on the use/application of molecular markers in Rubus research. Due to our innovative approach, we were unable to directly compare our results with those of other researchers, as such studies do not exist. Of course, the importance of using association mapping for other species can be emphasized. We did not do this in our work to avoid introducing unnecessary information. If the Reviewer and/or Editor so desires, we will supplement the Discussion with a paragraph discussing the results of association mapping obtained by other researchers for other species.
Point 15: 5. Conclusions
Response: We've improved the Conclusions by adding sentences with important information. We've also shortened the beginning of this chapter by removing two sentences.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Dear Authors,
Your manuscript has improved after revision.
However, many readers risk not understanding its main idea the first time and the innovativeness of your method, so I would advise you to explicitly indicate the species numbers you coded in Table 1 (even if they are ordinal). I also advise you to make a good graphic summary at the end of the manuscript - in which combinations of species researchers should look for unique sequences relative to the rest and to provide names of species for the top 10 markers.
Author Response
Response to Reviewer 1 Comments
Reviewer #1
Point 1: Your manuscript has improved after revision.
Response: Thank you very much.
Point 2: However, many readers risk not understanding its main idea the first time and the innovativeness of your method, so I would advise you to explicitly indicate the species numbers you coded in Table 1 (even if they are ordinal). I also advise you to make a good graphic summary at the end of the manuscript - in which combinations of species researchers should look for unique sequences relative to the rest and to provide names of species for the top 10 markers.
Response: We have added the order numbers of the individual species also found in Table A1. We have prepared a graphical diagram summarizing the principle of association mapping, with a focus on the Rubus species experiment under analysis. The diagram also includes the most important results obtained during the described experiment. We have also included the species names for the ten most important markers in the diagram. We have also included this information in the Abstract and Conclusions. We sincerely thank you again for your valuable comments. The current version of the manuscript has significantly improved its impact.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Firstly, congratulations to the authors. The manuscript has improved considerably. However, there are some aspects that should be taken into account. I have attached them below.
Good luck!
Comments for author File: Comments.pdf
Author Response
Response to Reviewer 2 Comments
Reviewer #2
Point 1: Firstly, congratulations to the authors. The manuscript has improved considerably. However, there are some aspects that should be taken into account. I have attached them below. Good luck!
Response: We sincerely thank you for appreciating our revisions. We also thank you again for your comments, which have been highlighted in the new version of the manuscript. Below, we have provided responses to each comment, adapting them to the appropriate places.
Point 2: 2.1. Plant Material.
Response: We have added the full names of the species and the ordinal numbers of these species found in Table A1. We have also added information on the significance of four pollen morphological features considered in the manuscript: “These four specific traits were chosen for study because, according to the literature, they best characterize Rubus pollen.”
Point 3: 2.3. Statistical Methods.
Response: We have corrected the highlighted section regarding the use of multivariate analysis of variance and one-way analysis of variance. The corrected sentence reads: "A multivariate analysis of variance (MANOVA) was carried out to determine the effects of species for all four the observed traits jointly. Nextly, one-way analyses of variance (ANOVA)".
Point 4: 5. Conclusions.
Response: We have corrected the highlighted passage. In the revised manuscript, the sentence appears as follows: "Notably, six markers m[5248] = 0001010010000000, m[4224] = 0001000010000000, m[5249] = 0001010010000001, m[60287] = 1110101101111111, m[61311] = 1110111101111111, and m[60286] = 1110101101111110 exhibited particularly high cumulative LOD scores, reflecting their strong discriminatory power."
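The six patterns quoted in this response can be cross-checked against their decimal indices. The sketch below, in plain Python, assumes the sixteen-character notation used throughout; the helper name `complement` is hypothetical. It also shows that the six markers form three pairs in which one pattern is the other with every 0 and 1 swapped; no biological interpretation of that pairing is implied here.

```python
WIDTH = 16

# Marker indices and patterns exactly as quoted in the response.
QUOTED = {
    5248: "0001010010000000",
    4224: "0001000010000000",
    5249: "0001010010000001",
    60287: "1110101101111111",
    61311: "1110111101111111",
    60286: "1110101101111110",
}


def complement(index, width=WIDTH):
    """Index of the pattern obtained by swapping every 0 and 1."""
    return (2 ** width - 1) - index


# Each quoted string should equal the zero-padded binary form of its index.
all_consistent = all(format(i, f"0{WIDTH}b") == s for i, s in QUOTED.items())

# Pairs (a, b) of quoted markers where b is the complement of a, listed once.
pairs = [(i, complement(i)) for i in QUOTED
         if i < complement(i) and complement(i) in QUOTED]
```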
Author Response File: Author Response.pdf