Article
Peer-Review Record

Growing Stock Volume Estimation for Daiyun Mountain Reserve Based on Multiple Linear Regression and Machine Learning

Sustainability 2022, 14(19), 12187; https://doi.org/10.3390/su141912187
by Jinhuang Wei and Zhongmou Fan *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 25 August 2022 / Revised: 20 September 2022 / Accepted: 23 September 2022 / Published: 26 September 2022
(This article belongs to the Section Sustainable Forestry)

Round 1

Reviewer 1 Report (Previous Reviewer 1)

All the comments are the same as those shared earlier. The manuscript uploaded to the system is not the revised one. I think the authors have not revised their manuscript following the reviewers' comments.

Author Response

Please accept my sincere apologies. The last manuscript uploaded to the system was not the revised version. My revised manuscript has now been uploaded using the Microsoft Word template. Your comments and those of the other reviewers have been addressed, and the changes are marked up. Thank you for reviewing this manuscript and providing your comments. Based on your comments, I have made the following revisions.

Point 1: The novelty of this work is not clear, it must be clearly mentioned in the abstract.

Response 1: The abstract has been revised. The novelty is now stated more explicitly in the abstract, for example: "By adding the measured data to the model, we can effectively overcome saturation to some extent, and improve the fitting performance of all models."

Point 2: The abstract of this article is not adequate; e.g., sentences like ‘It can be seen from the map that…’, which refer the reader to other sections or maps, must be avoided. The whole abstract must be rewritten. It must briefly include the research problem and the findings of the study, which clearly reflect the novelty of your work.

Response 2: The sentences identified above have been revised. In addition, the research questions and results have been simplified.

Point 3: In machine learning, especially in the classification model, the accuracy could also be high due to class imbalance in the dataset. Could the authors provide more information on the distribution of classes by providing information about confusion matrix? In case there is significant class imbalance, it might also aid their analysis if they perform under/over sampling or selectively penalize misclassification of the minority class.

Response 3: A confusion matrix is not applicable because the study uses machine learning regression models rather than classification models. Model accuracy was verified using R², RMSE, and rRMSE in the manuscript.
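
For readers less familiar with the three regression metrics named in Response 3, a minimal sketch follows. It assumes the common definitions (R² as one minus the ratio of residual to total sum of squares, and rRMSE as RMSE divided by the mean of the observed values, in percent); the manuscript may define rRMSE differently. The function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R2, RMSE, rRMSE in percent) for paired observed/predicted values.

    Assumed definitions (not confirmed against the manuscript):
      R2    = 1 - SS_res / SS_tot
      RMSE  = sqrt(SS_res / n)
      rRMSE = RMSE / mean(observed) * 100
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    mean_obs = sum(observed) / n

    # Residual and total sums of squares.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0 or mean_obs == 0:
        raise ValueError("observed values must vary and have a non-zero mean")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    rrmse = rmse / mean_obs * 100.0
    return r2, rmse, rrmse


# Hypothetical growing stock volumes for four plots (illustrative numbers only).
observed = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]
r2, rmse, rrmse = regression_metrics(observed, predicted)
print(f"R2={r2:.3f}  RMSE={rmse:.2f}  rRMSE={rrmse:.2f}%")
```

Because these metrics are computed on continuous predictions, they play the role for a regression model that a confusion matrix plays for a classifier.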

 

Point 4: In the methods section, the experimental procedure/methodology must normally be comprehensive and detailed enough to be understood by a novice and easily reproduced by other researchers. It is suggested that the procedure adopted, especially in the regression/machine learning section, be described in a more detailed and clearer way.

Response 4: To describe the experimental process and methods in more detail, Section 2.3.5, 'Selection of hyperparameters for machine learning models', has been expanded in this revision.

Point 5: It will further enhance the quality of this work if the current results are compared with more recent relevant studies.

Response 5: A literature review has been added to the Discussion section, comparing the environmental factors selected for GSV estimation with those in recent relevant studies. In addition, a review of the literature on the accuracy of GSV estimates has been added.

Point 6: The English write-up of this paper is weak throughout the manuscript.  It must be improved thoroughly.

Response 6: The manuscript has been submitted to a professional English-language editing service for two rounds of editing so that the English reads more naturally.

Author Response File: Author Response.pdf

Reviewer 2 Report (Previous Reviewer 2)

1. Please add section/s in the "Introduction" about the 3 machine learning (ML) approaches used (i.e., decision tree, random forest, extra trees). Provide a general description of the advantages and disadvantages of these approaches. How have these ML approaches been used before in the literature to estimate GSV or in similar cases? Also, provide a justification for why these approaches were selected instead of other techniques (such as support vector machine, gradient boosting, etc.).

2. Please also add a section for the literature review of the different environmental factors that were chosen as predictors of GSV. What does the literature say about these factors with respect to GSV? This will help justify why these factors are selected and might help the authors identify additional factors that can be included to help improve the GSV estimation model.

3. Figure 1. Please add insets showing the location of the study site within Fujian Province as well as Fujian Province with respect to China. Also, kindly cite Figure 1 in the manuscript.

4. In the Methodology section, kindly elaborate on the different ML methods used in the study. For example, for Decision Tree, how is it implemented? What parameters are used to run the model (e.g., the minimum number of observations in parent/child nodes, the strategy used to choose the split at each node, etc.)? Please provide a similar discussion for random forests and extra trees.

5. Kindly combine Figures 3 to 7 in one figure for input and just label letters (i.e., a to e).

6. Kindly improve Figure 8 to be more legible.

7.  Kindly add a section on the "Results" providing a descriptive discussion of the input data. For example, how variable is the altitude of the study area, and so on?

8. In the Discussion section, please discuss and explain your results. For example, how did "the inclusion of measured data in the model effectively mitigate the saturation caused by high-density natural forests to some extent"? Also, why do the models, even the ML ones, have quite a low R-squared? What are the potential reasons for such a result? Lastly, how does the ML approach in this study perform compared with similar studies on estimating GSV?

9. In the Discussion/Conclusion section, please add a paragraph on the limitation of the study. How do such limitations affect the result? Is it the cause of low R-squared? What should be done in future research to address these limitations?

Author Response

Please accept my sincere apologies. The last manuscript uploaded to the system was not the revised version. My revised manuscript has now been uploaded in Microsoft Word format. Your comments and those of the other reviewers have been addressed, and the changes are marked up. Thank you for reviewing this manuscript and providing your comments. Based on your comments, I have made the following revisions.

Point 1: Please add section/s in the "Introduction" about the 3 machine learning (ML) approaches used (i.e., decision tree, random forest, extra trees). Provide a general description of the advantages and disadvantages of these approaches. How have these ML approaches been used before in the literature to estimate GSV or in similar cases? Also, provide a justification for why these approaches were selected instead of other techniques (such as support vector machine, gradient boosting, etc.).

Response 1: In the introduction, the advantages and disadvantages of each machine learning method are now described and marked up, along with a literature review. A decision tree model is typical enough to be included, and we believe that additional trees can reduce the overfitting of a decision tree model. Support vector machines, gradient boosting, etc. are not included, as we believe that decision trees and random forests are typical enough. Therefore, these three models have been selected as the research methods. We will consider adding more machine learning models in future research.

Point 2: Please also add a section for the literature review of the different environmental factors that were chosen as predictors of GSV. What does the literature say about these factors with respect to GSV? This will help justify why these factors are selected and might help the authors identify additional factors that can be included to help improve the GSV estimation model.

Response 2: A literature review comparing the environmental factors selected as GSV predictors with those in recent relevant studies has been added to the Discussion section.

Point 3: Figure 1. Please add insets showing the location of the study site within Fujian Province as well as Fujian Province with respect to China. Also, kindly cite Figure 1 in the manuscript.

Response 3: As requested, Figure 1 has been modified.

Point 4: In the Methodology section, kindly elaborate on the different ML methods used in the study. For example, for Decision Tree, how is it implemented? What parameters are used to run the model (e.g., the minimum number of observations in parent/child nodes, the strategy used to choose the split at each node, etc.)? Please provide a similar discussion for random forests and extra trees.

Response 4: To describe the experimental process and methods in more detail, Section 2.3.5, 'Selection of hyperparameters for machine learning models', has been expanded.

Point 5: Kindly combine Figures 3 to 7 in one figure for input and just label letters (i.e., a to e).

Response 5: As requested, Figures 3 to 7 have been combined into Figure 4 and supplemented with Figure 5.

Point 6: Kindly improve Figure 8 to be more legible.

Response 6: A higher-resolution version of Figure 8 has been added.

Point 7:  Kindly add a section on the "Results" providing a descriptive discussion of the input data. For example, how variable is the altitude of the study area, and so on?

Response 7: The "Results" section has been updated with a description of the input data.

 

Point 8: In the Discussion section, please discuss and explain your results. For example, how did "the inclusion of measured data in the model effectively mitigate the saturation caused by high-density natural forests to some extent"? Also, why do the models, even the ML ones, have quite a low R-squared? What are the potential reasons for such a result? Lastly, how does the ML approach in this study perform compared with similar studies on estimating GSV?

Response 8: How "the inclusion of measured data in the model effectively mitigated the saturation caused by high-density natural forests to some extent" is explained in the 'Discussion' section. The last paragraph of that section explains why R² is low. A literature review comparing our GSV estimates with those of similar studies has been added at the beginning of the discussion.

Point 9: In the Discussion/Conclusion section, please add a paragraph on the limitation of the study. How do such limitations affect the result? Is it the cause of low R-squared? What should be done in future research to address these limitations?

Response 9: The last paragraph of the "Discussion" section describes the limitations of the study and how it can be improved in future studies.

Author Response File: Author Response.pdf

Reviewer 3 Report (Previous Reviewer 3)

Dear Authors and Editors,

The content of the paper is good and the quality of the research is good. But the way the paper has been written is not the way to write a manuscript. I suggest getting it written by an expert or a native English speaker before resubmission.

Author Response

Point 1: The content of the paper is good and the quality of the research is good. But the way the paper has been written is not the way to write a manuscript. I suggest getting it written by an expert or a native English speaker before resubmission.

Response 1: To begin with, I would like to thank you very much for your recognition of our research content. We have submitted the manuscript to a professional English-language editing service for two rounds of editing so that the English reads more naturally. Please accept my apologies: the last manuscript uploaded to the system was not the revised version. My revised manuscript has now been uploaded using the Microsoft Word template. Your comments, as well as those of the other reviewers, have been addressed, and the changes are marked up.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report (Previous Reviewer 1)

The authors have addressed all the concerns, and the revisions are satisfactory. This article can now be accepted for publication in Sustainability.

Reviewer 2 Report (Previous Reviewer 2)

Good job in addressing the reviewers' comments.

Reviewer 3 Report (Previous Reviewer 3)

The manuscript has improved from the first draft. It can be accepted.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

Data-driven machine learning has attracted significant attention for its advantages in solving multivariate nonlinear problems, which it does by building mathematical models that describe relationships between inputs (influencing factors) and outputs. Machine learning can be applied to both classification and regression, and it has been successfully introduced in several fields. In the article “Growing Stock Volume Estimation in Daiyun Mountain Reserve Based on Multiple Linear Regression and Machine Learning”, the authors used multiple linear regression and machine learning methods to construct a model for growing stock volume estimation in Daiyun Mountain Reserve.

Given the expediency of providing a report, my comments are somewhat limited, though I hope they are still useful to the editors and authors:

Overall the article is good, the methodology and procedure appear sound and the results are interesting. However, the English write up needs further improvement. This paper can be considered after necessary revisions. The following issues must be addressed and clarified before acceptance of the article.

1. The novelty of this work is not clear, it must be clearly mentioned in the abstract.

2. The abstract of this article is not adequate; e.g., sentences like ‘It can be seen from the map that…’, which refer the reader to other sections or maps, must be avoided. The whole abstract must be rewritten. It must briefly include the research problem and the findings of the study, which clearly reflect the novelty of your work.

3. In machine learning, especially in the classification model, the accuracy could also be high due to class imbalance in the dataset. Could the authors provide more information on the distribution of classes by providing information about confusion matrix? In case there is significant class imbalance, it might also aid their analysis if they perform under/over sampling or selectively penalize misclassification of the minority class.

4. In the methods section, the experimental procedure/methodology must normally be comprehensive and detailed enough to be understood by a novice and easily reproduced by other researchers. It is suggested that the procedure adopted, especially in the regression/machine learning section, be described in a more detailed and clearer way.

5. It will further enhance the quality of this work if the current results are compared with more recent relevant studies.

6. The English write-up of this paper is weak throughout the manuscript. It must be improved thoroughly.

Reviewer 2 Report

General Comments

1. Please avoid compound sentences. If possible, kindly split compound sentences into more readable ones.

2. When citing previous studies, kindly use only the last name without including the initials of the 1st author.

3. Consider revising the introduction portion of MLR. Also, kindly write a more coherent discussion about MLR. It can be started with a general description of the technique, assumptions, limitations, advantages, and disadvantages. Kindly also cite similar literature where MLR was used for GSV estimation.

4. On the study site map, please include an inset map showing where the study site is located with respect to mainland China. It can be a 2-level inset showing the relative location of Dehua County with respect to mainland China while the 2nd inset shows the location of Daiyun Mountain Reserve with respect to Dehua County.

5. When presenting figures kindly make the legends more readable.

6. In the MLR section of the Methodology, kindly include the different tests (e.g., multicollinearity) used to determine whether the MLR model being developed satisfies the assumptions of MLR.

7. The manuscript lacks the "Discussion" section.
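
As an illustration of the multicollinearity test requested in general comment 6, the sketch below computes the variance inflation factor (VIF) for the simplest case of two predictors, where VIF = 1 / (1 − r²) and r is the Pearson correlation between them. This is one common diagnostic and is offered only as an assumption about what such a test could look like; the function names and sample values are hypothetical and do not come from the manuscript. With more than two predictors, each VIF is obtained by regressing one predictor on all the others.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("each predictor must vary")
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor linear model.

    With exactly two predictors, the R-squared from regressing one on the
    other equals the squared Pearson correlation, so VIF = 1 / (1 - r**2).
    """
    r = pearson_r(x1, x2)
    if abs(r) >= 1.0:
        return math.inf  # perfectly collinear predictors
    return 1.0 / (1.0 - r * r)


# Hypothetical predictor values (illustrative only, not study data).
altitude = [1.0, 2.0, 3.0, 4.0, 5.0]
slope = [2.0, 1.0, 4.0, 3.0, 5.0]
print(f"r = {pearson_r(altitude, slope):.2f}, VIF = {vif_two_predictors(altitude, slope):.2f}")
```

A frequently cited rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic multicollinearity, although the appropriate threshold depends on the application.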

Specific Comments

Lines 22-24: Please provide citation/s on how real-time monitoring provides theoretical and scientific support for the study and preservation of organisms in reserve.

Lines 26-27: Please provide citation/s to support how GSV is being used as a reference standard for assessing dynamic changes and regional vegetation growth.

Lines 30-32: Please split this statement into two sentences and provide citations to support each statement.

Line 50: This line can start a new paragraph on the discussion about MLR.

Table 1: Slope and slope directions are not remote sensing data but are derivatives of elevation/altitude data.

Reviewer 3 Report

The manuscript is not written clearly and is hard to understand. Further, the methodology is not clearly defined, nor are the conclusions. The English is also poor, and typing mistakes are abundant. It is suggested that the manuscript be rewritten completely and resubmitted for consideration.
