Article
Peer-Review Record

How the Selection of Training Data and Modeling Approach Affects the Estimation of Ammonia Emissions from a Naturally Ventilated Dairy Barn—Classical Statistics versus Machine Learning

Sustainability 2020, 12(3), 1030; https://doi.org/10.3390/su12031030
by Sabrina Hempel 1,*,†, Julian Adolphs 2,†, Niels Landwehr 2,3, David Janke 1 and Thomas Amon 1,4
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 19 December 2019 / Revised: 20 January 2020 / Accepted: 22 January 2020 / Published: 31 January 2020

Round 1

Reviewer 1 Report

The article entitled “How the selection of training data and modeling approach affects the estimation of ammonia emissions from a naturally ventilated dairy barn - classical statistics versus machine learning” has the basic advantage of focusing on a real problem and supporting the study with real data. The Authors performed a statistical analysis that takes cross-validation into account. They compared model predictions using 27 different scenarios of temporal sampling, multiple measures of model accuracy and 8 different regression approaches. The presentation of data in tables and drawings is transparent and does not raise any objections.

The Authors estimated that the error of the predicted emission value with the tested measurement protocols was below 20%, but they did not propose a method to reduce the value of this error (except for the need to continue testing). I consider this fact to be the weakness of the reviewed work.

In addition, Authors should improve their diligence in bibliography.

“Earth Syst. Dynam. 2019, ..., ...” – incomplete data (line 514)

“Agriculture, ecosystems & environment” – should be replaced by “Agriculture, Ecosystems & Environment” (line 517)

“Journal of dairy science” – should be replaced by “Journal of Dairy Science” (line 521)

“Biosystems engineering” – should be replaced by “Biosystems Engineering” (lines 526, 527, 538, 539, 548, 549, 551,

“Science of the total environment” – should be replaced by “Science of the Total Environment” (line 560)

“Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ims 1999 reitz lecture, Sequoia Hall, Stanford University, California, 1999.” (lines: 586, 587) – can be replaced (for example): “Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ims 1999 Reitz lecture, Sequoia Hall, Stanford University, California, 1999 [last access December 2020: http://statweb.stanford.edu/~jhf/ftp/trebst.pdf]”

“The bulletin of mathematical biophysics” – should be replaced by “The Bulletin of Mathematical Biophysics” (lines: 590, 591)

 

Author Response

Response to Reviewer 1 Comments

We thank the reviewer for the valuable comments, which we addressed as described below. In addition, changes are marked in the PDF (red for deletions, green for additions).

Point 1:

The article entitled “How the selection of training data and modeling approach affects the estimation of ammonia emissions from a naturally ventilated dairy barn - classical statistics versus machine learning” has the basic advantage of focusing on a real problem and supporting the study with real data. The Authors performed a statistical analysis that takes cross-validation into account. They compared model predictions using 27 different scenarios of temporal sampling, multiple measures of model accuracy and 8 different regression approaches. The presentation of data in tables and drawings is transparent and does not raise any objections.

The Authors estimated that the error of the predicted emission value with the tested measurement protocols was below 20%, but they did not propose a method to reduce the value of this error (except for the need to continue testing). I consider this fact to be the weakness of the reviewed work.

Response 1:

We would have liked to present a solution that also significantly reduces the average error of emission predictions from short-term observations, as this is clearly of great practical relevance. Unfortunately, machine learning alone was not capable of significantly reducing the average error. However, our study showed that it can at least reduce the range of predicted emission values obtained from different time samples that all comply with a selected measurement protocol. In this way, the robustness of predicted emission values can be greatly increased, which already considerably reduces the uncertainty induced by the use of short-term observations.

We agree that reducing the average error below 20% without continuous measurements remains a challenge, as this value seems to be related more to the short-term variability of some of the included variables, or even of variables that have not yet been included in the model. Further detailed studies will be needed to solve this problem; these are, however, beyond the scope of the current manuscript.

Point 2:

In addition, Authors should improve their diligence in bibliography.

“Earth Syst. Dynam. 2019, ..., ...” – incomplete data (line 514)

“Agriculture, ecosystems & environment” – should be replaced by “Agriculture, Ecosystems & Environment” (line 517)

“Journal of dairy science” – should be replaced by “Journal of Dairy Science” (line 521)

“Biosystems engineering” – should be replaced by “Biosystems Engineering” (lines 526, 527, 538, 539, 548, 549, 551,

“Science of the total environment” – should be replaced by “Science of the Total Environment” (line 560)

“Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ims 1999 reitz lecture, Sequoia Hall, Stanford University, California, 1999.” (lines: 586, 587) – can be replaced (for example): “Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ims 1999 Reitz lecture, Sequoia Hall, Stanford University, California, 1999 [last access December 2020: http://statweb.stanford.edu/~jhf/ftp/trebst.pdf]”

“The bulletin of mathematical biophysics” – should be replaced by “The Bulletin of Mathematical Biophysics” (lines: 590, 591)

Response 2:

We added the missing data for the reference Earth Syst. Dynam. 2019, adjusted the spelling of the mentioned journals and added a note on last access for the lecture reference.

Reviewer 2 Report

Interesting, topical, clearly stated problem and findings.

The English is a bit wobbly and includes a mix of American and British English spellings. Suggest check through this.

Line 85, different settings, but you only looked at one farm so I don't think you can say this.

Line 103, littered. I think you need to say what the bedding material was and thickness, this can have a significant impact on methane emissions.

Line 109, More detail on the ration, at least possible confusion as for some corn and maize are synonymous. Presumably you mean grain, but it would be of interest to know what the concentrate part of the TMR was.

Line 228. herd not heard.

Lines 212-216, So which two months were missed out?

 Line 241, clearer to use means rather than averages

Lines 247 and 253, Not sure about the word punish here, and elsewhere, consider "weighted"?

Lines 368-370, Can you express this sentence more clearly? It is a bit of a muddle as it is.

Lines 376 - 378, yes, and maybe extreme values are important.  Actually this is a comment not a question or suggestion.

Line 407. yes, huge, it is true but this is likely I'd have thought given the enormous variation likely.

Lines 427-428, Could you give a couple of contrasting examples of these to illustrate your point?

Lines 432-473 The conclusion is far too long. It is almost the same length as the discussion. Just stick to the main points discovered and their contextualisation and importance. Most of this could go in the discussion.

 

  

Author Response

Response to Reviewer 2 Comments

We thank the reviewer for the valuable comments, which we addressed as described below. In addition, changes are marked in the PDF (red for deletions, green for additions).

Point 1:

Interesting, topical, clearly stated problem and findings.
The English is a bit wobbly and includes a mix of American and British English spellings. Suggest check through this.

Response 1:

We apologize that there was some confusion in our phrasing. We rechecked the spelling, corrected some typographical errors and slightly reformulated some sentences for better clarity and readability.

Point 2:

Line 85, different settings, but you only looked at one farm so I don't think you can say this.

Response 2:

We added “considering a dataset of one continuously monitored farm” to make this point clearer.

Point 3:

Line 103, littered. I think you need to say what the bedding material was and thickness, this can have a significant impact on methane emissions.

Response 3:

We agree that bedding material can have a great influence on the total methane emissions and probably even more on the ammonia emissions in our dataset. This effect is expected to mainly influence the individual measurement values and the final aggregated value rather than the capability of the regression method to predict a robust emission value from a given sample of hourly values. However, for the sake of completeness, we added the following information: The cubicles had a deep bedding with a depth of around 0.2 m and bedding material of chopped straw and chalk.

Point 4:

Line 109, More detail on the ration, at least possible confusion as for some corn and maize are synonymous. Presumably you mean grain, but it would be of interest to know what the concentrate part of the TMR was.

Response 4:

We apologize for the confusion. We replaced the ambiguous wording “corn and maize silage” with the average percentages of ingredients in the TMR, namely soy (24%), oilseed rape (19%), maize (24%), rye (23%), and lupins (10%).

Point 5:

Line 228. herd not heard.

Response 5:

Presumably, this comment relates to Line 128. We changed “heard” to “herd”.

Point 6:

Lines 212-216, So which two months were missed out?

Response 6:

The two missing months were essentially in autumn. As mentioned in Line 122, gas concentrations were monitored on-farm from November 2016 to September 2017. Consequently, October was missed completely. In addition, a large part of the September data was not available for our study. To make this point clearer, we added a half-sentence to Section 2.2.2 stating that the measurement period ran from the beginning of November to the end of August.

Point 7:

Line 241, clearer to use means rather than averages

Response 7:

We changed “averages” to “means”.

Point 8:

Lines 247 and 253, Not sure about the word punish here, and elsewhere, consider "weighted"?

Response 8:

We changed “punished” to “weighted” throughout the manuscript.

Point 9:

Lines 368-370, Can you express this sentence more clearly? It is a bit of a muddle as it is.

Response 9:

We reformulated the sentence: “This means that the probability to select a suboptimal training period, that is, a training period that leads to a large deviation between estimated and actual measurement value, was high even for a scenario with six measurement periods of two weeks each.”

Point 10:

Lines 376 - 378, yes, and maybe extreme values are important.  Actually this is a comment not a question or suggestion.

Response 10:

We agree.

Point 11:

Line 407. yes, huge, it is true but this is likely I'd have thought given the enormous variation likely.

Response 11:

We agree that the variation is not unexpected. To highlight this point in the paper, we added: “This uncertainty has to be accepted not only due to the diverse measurement conditions, but also due to a large variation in barn design, management and herd composition in the different studies.”

Point 12:

Lines 427-428, Could you give a couple of contrasting examples of these to illustrate your point?

Response 12:

We added two examples to make this point clearer:
For example, if we are interested in an emission factor of a husbandry system a sampling that is associated with a low TAE is valuable.  On the other hand, if we like to predict emissions of a building for certain weather conditions, for example, in order to implement a control strategy in terms of precision farming, the TAE is less important. In such a situation, a temporal sampling strategy that permits to train a model that captures the emission dynamics (e.g., indicated by high R2 values) would be more valuable.
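
The contrast between the two examples can be made concrete with a small synthetic sketch. It is only an illustration under stated assumptions: this record does not define TAE, so the code assumes it behaves like a relative error of the summed emission, and it uses the usual coefficient of determination for R2. The data, the two "models", and all function names are hypothetical and are not taken from the manuscript.

```python
import math


def relative_total_error(y_true, y_pred):
    """Relative error of the summed emission (assumed stand-in for TAE)."""
    total_true = sum(y_true)
    return abs(sum(y_pred) - total_true) / abs(total_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic hourly emissions over two weeks with a daily cycle (arbitrary units).
observed = [10.0 + 5.0 * math.sin(2 * math.pi * h / 24) for h in range(24 * 14)]

# Hypothetical model A: predicts the overall mean every hour.
# The total is reproduced, but the hourly dynamics are missed entirely.
mean_observed = sum(observed) / len(observed)
model_a = [mean_observed] * len(observed)

# Hypothetical model B: follows the daily cycle but overestimates by 10 %.
model_b = [1.10 * value for value in observed]

for name, pred in (("A (flat mean)", model_a), ("B (biased, dynamic)", model_b)):
    print(
        f"model {name}: total error = {relative_total_error(observed, pred):.3f}, "
        f"R2 = {r_squared(observed, pred):.3f}"
    )
```

In this toy setting, model A would suit an emission-factor estimate, because only the aggregated value matters, whereas model B would be preferable for a control strategy that depends on how emissions respond to changing conditions.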

Point 13:

Lines 432-473 The conclusion is far too long. It is almost the same length as the discussion. Just stick to the main points discovered and their contextualisation and importance. Most of this could go in the discussion.

Response 13:

We shortened the conclusion section by about one third, focusing only on the main conclusions. Parts of the text have been moved to results and discussion.
