Article
Peer-Review Record

Growth Indexes and Yield Prediction of Summer Maize in China Based on Supervised Machine Learning Method

Agronomy 2023, 13(1), 132; https://doi.org/10.3390/agronomy13010132
by Lijun Su 1,2,*, Tianyang Wen 3, Wanghai Tao 1,*, Mingjiang Deng 1,3, Shuai Yuan 3, Senlin Zeng 3 and Quanjiu Wang 1,3
Submission received: 26 November 2022 / Revised: 22 December 2022 / Accepted: 27 December 2022 / Published: 30 December 2022
(This article belongs to the Special Issue Recent Advances in Crop Modelling)

Round 1

Reviewer 1 Report

Growth indexes and yield prediction of Summer maize in China based on machine learning method

Dear Authors

The basic science of this paper is sound and of an appropriate standard. The authors have written the paper in line with the journal's scope. They propose a prediction model based on a machine learning regression algorithm. First, a data pool was constructed by collecting measured corn test data from the main planting areas. The total water input (rainfall plus irrigation water), fertilization, soil quality, and planting density were selected as the training set. I have some minor and major comments.

The title is not in good form. The authors should add the name of the machine learning method to the title, because no one knows which machine learning technique you used in this study. (Major comment).

The abstract is in a good form but the author should provide the significance of this research at the end of the abstract section.

Research questions are missing in the introduction section.

Objectives are very clear at the end of the introduction section.

In the material section, the authors should explain the study area, then the data source and methodology.

Check the number of the equation in the whole manuscript.

Figures are according to journal criteria.

Results and discussion should be the Result

I hope the authors will resubmit this very soon in this journal.

Best Regards

Line 372: Discussion. This section should be the 4th heading.

The conclusion should be at 5th.

There are many typographical mistakes throughout the manuscript.

The author should check the whole manuscript.

Best Regards

Author Response

  1. The title is not in good form. The authors should add the name of the machine learning method to the title, because no one knows which machine learning technique you used in this study. (Major comment).

Thank you for your suggestion. I revised the title to “Growth indexes and yield prediction of Summer maize in China based on supervised machine learning method”.

  1. The abstract is in a good form but the author should provide the significance of this research at the end of the abstract section.

Thank you for your suggestion. I added the significance of this research at the end of the abstract as follows:

“The supervised machine learning regression algorithm provided a simple method to predict the yield of maize, and optimize the total water inputs and nitrogen applications only using the soil qualities and planting density. ”

  1. Research questions are missing in the introduction section.

Thank you for your suggestion. I rewrote this section.

  1. In the material section, the authors should explain the study area, then the data source and methodology.

Thank you for your suggestion. I deleted the paragraph about the data source and methodology. Because the study areas involve 35 sites in 13 provinces of China, we cannot describe the conditions at every site. We therefore summarized the common characteristics of all sites in Section “2.1 Data source”.

  1. Check the number of the equation in the whole manuscript.

Thank you for your suggestion. I have checked it.

  1. Results and discussion should be the Result

Thank you for your suggestion. I revised it.

  1. Line 372: Discussion. This section should be the 4th heading.

Thank you for your suggestion. I revised it.

  1. The conclusion should be at 5th

Thank you for your suggestion. I revised it.

Reviewer 2 Report

The paper is quite interesting however it appears to be limited by some simplifications and assumptions not properly supported. 

As I understand it, Figure 2 reports the 236 values from different sites. To make the possible influence of different sites clearer, the authors should use different colors or shapes for the points of the graph, based on the different sites (or at least on different districts).

Similarly it should be done for figures 5 and 8, using the same colors/shapes.

 

Gaussian predictions apparently give limited variability (as visible from the very narrow bands in Figures 2a and 2b, 5a and 5b, 8a and 8b). It looks as if the tails have been cut off: could you explain the reasons behind this "almost perfectly linear borders" shape?

 

Crop management, mechanization and other technologies have not been considered in the study. However, mechanization and other technologies might have an important effect on crop yield and LAI, as discussed in many papers. See, e.g.:

Quantification of mechanization index and its impact on crop productivity and socio-economic factors Abbas, A., Yang, M., Elahi, E., (...), Ahmad, R., Iqbal, T. 2017 International Agricultural Engineering Journal, 26(3), pp. 49-54

Ten years of corn yield dynamics at field scale under digital agriculture solutions: A case study from North Italy, Kayad, A., Sozzi, M., et al. Computers and Electronics in Agriculture, 2021, 185, 106126

The authors should discuss in the paper to what extent different degrees of mechanization might alter the predictions.

 

What is the validity of the work in a climate change scenario? Could the approach be applied to simulate different climate evolutions? This aspect might be stressed and would add more value to the work.

 

Some "experimental" sites are located in coastal areas: normally coastal areas are characterized by appreciably different climate conditions. Is this affecting the results? Nevertheless, the authors are combining sites with deeply different geographical characteristics. Are the applied models usable, even though there are deep geographical differences?

 

Other:

The literature review and references are too biased: more than 60 of the 74 references are from Chinese authors, not reflecting actual advancements in agricultural research.

Line 187: Correct " function had the better fitting" with " function had a better fitting" or " function had the best fitting"

Use the same graph dimensions for Figures 2, 5 and 8.

Author Response

  1. As I understand it, Figure 2 reports the 236 values from different sites. To make the possible influence of different sites clearer, the authors should use different colors or shapes for the points of the graph, based on the different sites (or at least on different districts).

Thank you for your suggestion. I redrew the figures and marked the points with different colors and shapes.

  1. Similarly it should be done for figures 5 and 8, using the same colors/shapes.

Thank you for your suggestion. I redrew Figures 5 and 8.

  1. Gaussian predictions apparently give limited variability (as visible from the very narrow bands in Figures 2a and 2b, 5a and 5b, 8a and 8b). It looks as if the tails have been cut off: could you explain the reasons behind this "almost perfectly linear borders" shape?

Thank you for your suggestion. Machine learning theory is not our specialty, but I will try to explain the linear borders in terms of Gaussian process regression. Briefly, a Gaussian process assumes that the regression function f(x) follows a Gaussian distribution:

f(x) ~ GP(m(x), K(x,x’))

where m(x) = E(f(x)) and K(x,x’) = E([f(x)-m(x)][f(x’)-m(x’)]’). K(x,x’) is determined by the kernel function and the variance of the data set.

Therefore, the deviation of the predicted values should not exceed the confidence interval of the function f(x), as shown in the figure. The deviations between the predicted and observed values are accordingly confined between the parallel lines.

Moreover, a Gaussian process assumes that the errors of the regression model also follow a normal distribution:

y = f(x) + e, and e ~ N(0, σ²)
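To make the notation above concrete, the following is a minimal sketch of Gaussian process regression for a single scalar input. It is not the authors' implementation: the zero prior mean m(x) = 0, the squared-exponential kernel, and the hyperparameter values (`length_scale`, `signal_var`, `noise_var`) are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel K(x, x') for scalar inputs (assumed kernel)."""
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L z = b by forward substitution."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z

def solve_upper_from_lower(L, z):
    """Solve L^T a = z by back substitution."""
    n = len(z)
    a = [0.0] * n
    for i in reversed(range(n)):
        a[i] = (z[i] - sum(L[k][i] * a[k] for k in range(i + 1, n))) / L[i][i]
    return a

def gp_predict(x_train, y_train, x_new, noise_var=0.01):
    """Posterior mean and variance of f(x_new), given y = f(x) + e, e ~ N(0, noise_var)."""
    n = len(x_train)
    # Covariance of the noisy observations: K(x, x') plus noise on the diagonal.
    K = [[rbf_kernel(x_train[i], x_train[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_from_lower(L, solve_lower(L, y_train))
    k_star = [rbf_kernel(x, x_new) for x in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf_kernel(x_new, x_new) - sum(vi * vi for vi in v)
    return mean, var
```

Calling `gp_predict(xs, ys, x_new)` returns the predictive mean together with the variance of f at `x_new`. Under these assumptions the variance is small near the training inputs and returns to the prior value `signal_var` far away from them, which is the sense in which the model attaches an interval to each prediction.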

So, we deleted the sparse points and the corresponding collected data, based on the original predicted yields of the 324 data points, as shown in the following figure.

 

Finally, we chose 303 data points as the samples and obtained new learning regression models of yield. The analytical approach for leaf area index and dry matter accumulation is the same as for yield.
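The response does not state the criterion used to reduce the 324 points to 303. As a hedged illustration only, one common way to drop sparse points is to keep those whose residual (observed minus predicted) lies within k standard deviations of the mean residual. The function name and the threshold `k = 2.0` below are hypothetical and are not taken from the manuscript.

```python
import statistics

def filter_by_residual_band(observed, predicted, k=2.0):
    """Return the indices of points whose residual lies within k standard deviations.

    The threshold k is an assumed value; the actual rule used by the
    authors to select 303 of 324 points is not given in the response.
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    center = statistics.mean(residuals)
    spread = statistics.stdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r - center) <= k * spread]
```

Any filter of this kind bounds the remaining residuals by construction, so a plot of predicted against observed values drawn after filtering would be expected to show straight, parallel edges.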

 

  1. Crop management, mechanization and other technologies have not been considered in the study. However, mechanization and other technologies might have an important effect on crop yield and LAI, as discussed in many papers. See, e.g.:

Thank you for your suggestion. According to the references proposed by the reviewer, the mechanization index has a notable relationship with crop yield. Apart from data size, the more key factors are considered, the higher the prediction accuracy will be. Thus, the deviation of the predicted values is probably caused by the lack of key factors affecting crop yield. I added a discussion of the influence of crop management, mechanization and other technologies on crop yield, as follows:

“Moreover, crop management (crop varieties, irrigation and fertilization intervals, and organic fertilizer), mechanization and other technologies also affect crop growth [76, 77], resulting in the large errors in this study, as shown in Table 2. The mechanization index (MI) is defined as the ratio of energy used by machinery to the total energy used by human, animal [78], and it has a notable relationship with crop yield [76]. A high MI means a high crop yield. The reason is that farmers with a high MI use more advanced technologies to manage crop cultivation than farmers with a low MI. The lack of key factors affecting crop yield clearly reduces the model accuracy, as shown in Figures 2, 5 and 8. Thus, the more key factors are considered, the higher the prediction accuracy will be.”

  1. The authors should discuss in the paper to what extent different degrees of mechanization might alter the predictions.

Thank you for your suggestion. See the answer of the previous comment.

  1. What is the validity of the work in a climate change scenario? Could the approach be applied to simulate different climate evolutions? This aspect might be stressed and would add more value to the work.

Thank you for your suggestion. Climate change mainly affects temperature and rainfall. I think that if we have enough temperature and rainfall data in the training set, the proposed method can be applied to simulate yield under climate change.

  1. Some "experimental" sites are located in coastal areas: normally coastal areas are characterized by appreciably different climate conditions. Is this affecting the results? Nevertheless, the authors are combining sites with deeply different geographical characteristics. Are the applied models usable, even though there are deep geographical differences?

Thank you for your suggestion. Supervised machine learning is a method for finding the underlying relationship between the independent variables and the dependent variable. The more informative the independent variables in the training set, the higher the prediction accuracy for the dependent variable. Deep geographical differences will lead to large errors. However, if training data from sites with deeply different geographical characteristics are obtained, the proposed method is still useful.
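One way to examine this claim, which is not described in the response, is a leave-one-site-out check: hold out all records from one site, fit the model on the remaining sites, and report the RMSE on the held-out site. The sketch below uses a one-variable least-squares line as a stand-in model purely to stay self-contained; the record layout `(site, x, y)` and the function names are assumptions, and the study's own regression models would take the place of `fit_line`.

```python
from collections import defaultdict
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b

def leave_one_site_out_rmse(records):
    """records: list of (site, x, y). Returns {site: RMSE when that site is held out}."""
    by_site = defaultdict(list)
    for site, x, y in records:
        by_site[site].append((x, y))
    if len(by_site) < 2:
        raise ValueError("at least two sites are needed")
    scores = {}
    for held_out, test_pts in by_site.items():
        # Train on every site except the one being held out.
        train = [(x, y) for s, pts in by_site.items() if s != held_out for x, y in pts]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        sq = [(y - (a + b * x)) ** 2 for x, y in test_pts]
        scores[held_out] = math.sqrt(sum(sq) / len(sq))
    return scores
```

A site whose held-out RMSE is much larger than the others is one that the remaining training data do not represent well, which is the situation the reviewer raises for coastal sites.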

  1. The literature review and references are too biased: more than 60 of the 74 references are from Chinese authors, not reflecting actual advancements in agricultural research.

Thank you for your suggestion. Because this manuscript focuses on summer maize in China, the data were collected from 44 publications by Chinese authors.

  1. Line 187: Correct " function had the better fitting" with " function had a better fitting" or " function had the best fitting"

Thank you for your suggestion. I revised it.

  1. Use the same graph dimensions for figures 2, 5 and 8

Thank you for your suggestion. I revised it.


Round 2

Reviewer 1 Report

Agreed about changes in the revised manuscript

Author Response

Dear reviewers and editor,

 

Thank you for working on our manuscript. I revised the manuscript using the “Track Changes” function.

 

  1. I checked Figures 2, 5, and 8, and revised “RMESE” to “RMSE”.

  1. I added the units of yield and Dmax in Table 3, and revised the decimals to integers for the Measured and Predicted values.

  1. Line 330. “2.1. Model Comparison” was revised to “3.3.1. Model Comparison”

  1. Line 348. “2.2. Model Verification” was revised to “3.3.2. Model Verification”

  1. Line 365. “3.2.3. Water and nitrogen coupling function” was revised to “3.3.3. Water and nitrogen coupling function”

 

Reviewer 2 Report

Growth indexes and yield prediction of Summer maize in China based on machine learning method

The paper has been clearly improved and is now ready for publication. 

Some minor comments:

Please check Figures 2, 5 and 8: should it be RMSE rather than RMESE? If so, please correct.

In table 3 some units are missing

In Table 3, I would avoid using decimals for the Measured and Predicted values of yield and for the Measured and Predicted values of Dmax.

Author Response

 

Dear reviewers and editor,

 

Thank you for working on our manuscript. I revised the manuscript using the “Track Changes” function.

 

  1. I checked Figures 2, 5, and 8, and revised “RMESE” to “RMSE”.

  1. I added the units of yield and Dmax in Table 3, and revised the decimals to integers for the Measured and Predicted values.

  1. Line 330. “2.1. Model Comparison” was revised to “3.3.1. Model Comparison”

  1. Line 348. “2.2. Model Verification” was revised to “3.3.2. Model Verification”

  1. Line 365. “3.2.3. Water and nitrogen coupling function” was revised to “3.3.3. Water and nitrogen coupling function”

 
