Article
Peer-Review Record

Measurement of Biochemical Methane Potential of Heterogeneous Solid Substrates: Results of a Two-Phase French Inter-Laboratory Study

Water 2020, 12(10), 2814; https://doi.org/10.3390/w12102814
by Thierry Ribeiro 1,*, Romain Cresson 2, Sébastien Pommier 3, Sébastien Preys 4, Laura André 1, Fabrice Béline 5, Théodore Bouchez 6, Claire Bougrier 7, Pierre Buffière 8, Jesús Cacho 7, Patricia Camacho 9, Laurent Mazéas 6, André Pauss 10, Philippe Pouech 11, Maxime Rouez 9 and Michel Torrijos 12
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 27 July 2020 / Revised: 30 September 2020 / Accepted: 2 October 2020 / Published: 10 October 2020

Round 1

Reviewer 1 Report

This manuscript describes an interesting interlaboratory study that adds to the limited available information on precision of BMP measurement. But the manuscript has several significant problems that need to be addressed.

1. Organization. What appear to be methods are described in the results section. Some sentences are repeated exactly.
2. Presentation of results. A single type of plot is used repeatedly, along with some simple tables. Some responses are discussed but the underlying data are not shown.
3. Discussion. It largely reads like an introduction, and does not include implications or limitations of the work.

These points are addressed in more detail below, where you can find other comments as well. Note that I have used "a -> b" to mean "change a to b" in some places.

Title. Biological -> Biochemical (as in your text). Otherwise it is OK as is, although it is not very descriptive and "Results from a" does not add much. You might consider adding the terms "measurement" and "precision" somehow.

Abstract. Missing a conclusion or some connection between the background/motivation and your results. For example, it does not include the important result that a harmonized protocol did not improve precision.

Throughout. Minor wording problems. Please closely check language. See examples:
L 54. was -> is
L 68. "we" -- is this meant to be a personal observation?
L 108 problematic -> problem (or question or topic)
L 135 "twenty years (twenty past years)"
L 382 few -> little
L 420 excess -> exceed
L 475 delete one "monohydrate"

L 106, 149 and perhaps elsewhere. Why is "inoculum" in italics?

L 108-128. Is Mottet et al. (ref. 37) the correct citation? This paper seems to be on a different topic and I do not see the results you describe. Also, assuming the citation can be corrected, I wonder if all this detail is necessary, especially if it is given in the paper you will (ultimately) cite. Could it be written more concisely? Additionally, a recent paper covers this issue in some detail, and could be used here: Raposo et al. https://doi.org/10.1016/j.rser.2020.109890. Actually you cite it as ref. 13 but not in this part of the introduction. Please take a closer look and see if it is relevant for this part. And consider that the focus of this section is primarily on consistency or completeness reporting of BMP test methods--perhaps information on which test components influence BMP would be more relevant.

L 122. "The quality of the measurement was evaluated with the use of blank. . . controls" Is this correct? It is my understanding that blanks are used simply to determine endogenous CH4 production. Also the "and/or" suggests that blanks and positive controls could be interchangeable, which is not the case.

Introduction. In your review of interlaboratory studies you are missing another recent study here: Hafner et al. doi:10.3390/w12061752. Actually it is cited in the discussion as reference 54. Please take a closer look at the paper and see if there is information relevant to your introduction.

L 196-204. These objectives are not completely consistent with the abstract, which suggests the objective was to test a harmonized protocol.

L 250-252. Not clear enough. Were TS and VS measured by all labs? If so is this reproducibility you describe inter-laboratory? If so, shouldn't it be in the results section?

Table 1. Did you determine elemental composition and if so could you provide calculated maximum theoretical BMP? This would be interesting to have when evaluating the measured BMP values. Even without elemental composition, you could make an estimate based on nutritional information (perhaps including some reasonable guesses) as described in your ref. 53 https://doi.org/10.3390/w12051223, although the uncertainty may be too high.

L 260-261. Correct described order, first dried then ground. Note grinded -> ground. Also, do you have reference that shows no loss of volatiles at 80C?

L 266 and elsewhere. Consider "bucket" -> "bottle", depending on what was actually used.

L 270-271. Can you provide a reference for the mineral solution? What was it based on?

L 276-278. Was this done only by one laboratory? Clarify.

Section 2.4. Did the individual labs calculate BMP for each individual replicate and then submit those data? Did you provide any guidelines on how the calculation should be carried out or was it up to the labs (and assumed to be simple enough that information was not needed or does not need to be described)?

L 299-305. This discussion on limitations of kinetic information from BMP tests seems out of place. Move to introduction if you keep it. Consider whether it adds to the manuscript. Also, this topic is discussed in other work including a recent paper by Koch et al. https://doi.org/10.3389/fenrg.2020.00063.

Figure 2. I don't think this is necessary. It really only shows that the second BMP test was started 4 weeks after the first. It is not clear what you mean by BMP1, BMP2, etc. Individual bottles/replicates? How about blanks and positive control samples?

Figure 3. Consider moving to results. Could you be more specific about manual methods, e.g., were they all manometric?

L 332-334. Was this done by each laboratory? Or did they report as-measured volume and you corrected values? If the latter, do you have reference for this calculation? There seems to be some variation in how it is carried out.
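
For readers unfamiliar with the correction in question, the following is a minimal sketch of one common way to normalize a measured gas volume to reference conditions using the ideal gas law. It is not taken from the manuscript. The function name, the reference conditions (0 °C and 101.325 kPa) and the omission of any water vapour correction are assumptions made for illustration, and laboratories may differ on each of these points.

```python
def normalize_gas_volume(volume_ml, temp_c, pressure_kpa,
                         ref_temp_k=273.15, ref_pressure_kpa=101.325):
    """Scale a dry gas volume to reference temperature and pressure.

    Assumption: ideal gas behaviour, no water vapour correction.
    The reference conditions are illustrative defaults, not a standard
    prescribed by the study under review.
    """
    temp_k = temp_c + 273.15  # convert measurement temperature to kelvin
    return volume_ml * (pressure_kpa / ref_pressure_kpa) * (ref_temp_k / temp_k)


# Hypothetical reading: 100 mL measured at 35 degrees C and 101.325 kPa.
normalized = normalize_gas_volume(100.0, 35.0, 101.325)
```

Because the choice of reference conditions and the treatment of water vapour both change the result, reporting them alongside the BMP values would address the variation mentioned above.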

S 2.5. Consider briefly describing the calculations because they are not so clear in these ISO documents, which, furthermore, are not accessible to everyone. I believe these calculations are relatively simple, e.g., repeatability standard deviation was calculated as the substrate mean of standard deviation values calculated from 3 replicates for each lab x test combination, etc. But some more clarity could be good for readers without experience in this area or familiarity with the ISO standards. Also, some details are not clear. Is it correct that error from subtraction of endogenous production was not included in your precision estimates?
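
As an illustration of the kind of calculation meant here, the sketch below pools the within-group variances of replicate bottles to obtain a repeatability standard deviation, then expresses it as a relative standard deviation. Pooling variances (rather than averaging standard deviations) is the usual convention in precision standards, but whether it matches the authors' calculation is an assumption. The function names and the example data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def repeatability_sd(groups):
    """Pooled within-group standard deviation.

    Each element of `groups` holds the replicate BMP values for one
    lab x test combination (at least two values per group).
    """
    weighted = sum((len(g) - 1) * variance(g) for g in groups)
    dof = sum(len(g) - 1 for g in groups)
    return sqrt(weighted / dof)


def relative_sd_percent(sd, groups):
    """Express a standard deviation relative to the grand mean, in percent."""
    grand_mean = mean(x for g in groups for x in g)
    return 100.0 * sd / grand_mean


# Hypothetical triplicates from two lab x test combinations.
example_groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
s_r = repeatability_sd(example_groups)
rsd_r = relative_sd_percent(s_r, example_groups)
```

This sketch ignores any uncertainty from subtracting endogenous production, which is the point raised in the last question above.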

L 347. Can you be more specific about what you mean by "satisfactory".

L 348-353. Treatment of outliers is an important topic, as you imply. In the least, more details on how outliers were identified is needed, along with information on the number of outliers removed in the results section (I see around line 440). This information belongs in the methods section in my opinion. You might also consider repeating calculations with the suspected outliers included. It may not be correct that these outliers would never be provided to a customer.

Fig. 4 and similar plots. These have some problems:
* x axis label missing (presumably 1-11 are the different labs)
* It is not clear what results are shown here--BMP calculated by individual bottle (assay) (i.e., n = 6) and then summarized?
* What do different parts of plot show? Presumably mean, median, quartiles, extreme values, but please describe, perhaps in the methods section.

These plots show well variability within labs (e.g., by the size of each box). But if I understand correctly, they show what you referred to as repeatability and intra-laboratory reproducibility combined, which obscures information. Could you, instead, use different symbols for sets A and B, i.e., two boxes for each lab shown in different colors. They also provide some information on correlation among substrates within a lab, i.e., lab 4 shows low results for both SA, SA', but not SB. Finally, they show variability among labs (different boxes etc.). So they are useful figures but some results could be more clearly shown with different plots. These would include some statements in the results on the apparent effects (or lack of effects) of measurement method or other factors. Adding more information to the existing plots is an alternative for some results. For example, using different colors for the different measurement methods would allow readers to compare them. For others, new plots or tables would be helpful. For example, comparison of SA and SA' is probably best done using a scatterplot (SA BMP on one axis, SA' on another, points for each lab). Please consider how you can more clearly show the important results.

L 354-357 and 441-445. Details on these precautions for the statistical analysis are needed.

L 356. workforce -> sample size?

L 360-362. "For other factors like the assessment method. . . " No ANOVA or other results are shown to support this statement. If you think this is a significant result, please support it by showing at least mean values and p values from hypothesis tests or similar information (i.e., standard error estimate for comparison), possibly along with a figure or table. As part of this information, please include the sample sizes for the different factor levels (so far this is only given for measurement type in Fig. 2). If you do not believe these relationships and others (see related comments below) are meaningful (after all, the sample size is quite small (n = 3 for AMPTS), the structure is repeated/hierarchical/nested as you pointed out in L 354-357, and factor levels were not randomly assigned), then why mention them at all? In that case simply state that there was no attempt to assess the impact of test factors on BMP and briefly explain why.

L 362-363. If you conclude that there was no detectable effect for the effect of freezing etc. (SA vs. SA'), then support this with results from your statistical analysis. Include the mean difference and a p value at least. And did you use a paired test here, e.g., include "lab" as a factor? The design is clearly paired/"repeated measures" so you should take advantage of the higher power this provides.
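
To make the suggestion concrete, here is a minimal sketch of a paired comparison in which each laboratory contributes one mean BMP for SA and one for SA'. It returns the mean difference, the t statistic and the degrees of freedom; the p value would then come from a t distribution table or a statistics package. The function name and the example values are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t statistic for two equally long sequences of lab means.

    Returns (mean difference, t statistic, degrees of freedom).
    Assumes the per-lab differences are not all identical; otherwise
    the standard error is zero and the statistic is undefined.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    diffs = [a - b for a, b in zip(x, y)]
    d_mean = mean(diffs)
    std_err = stdev(diffs) / sqrt(len(diffs))
    return d_mean, d_mean / std_err, len(diffs) - 1


# Hypothetical per-lab mean BMP values for SA and SA'.
sa = [10.0, 12.0, 14.0, 16.0]
sa_prime = [9.0, 10.0, 13.0, 14.0]
d_mean, t_stat, dof = paired_t(sa, sa_prime)
```

Pairing by laboratory removes the between-lab variation from the comparison, which is why it has more power than an unpaired test for this design.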

L 375. samples -> substrates?

L 381-388. Show more details to support these results. See comment above for L 360-362. Additionally, some information is repeated here (SA vs. SA').

Section 3.2. This material seems to be part of the methods, and I think it should be moved to that section.

L 401-402. Why capitalize "Mineral" etc.?

L 403. Why capitalize "Carbonate"?

L 404-405. Clarify, so I:S could be 2 or 5 for some substrates, i.e., some labs used 2 and some used 5? Which substrate(s)?

L 409. Is "blank" an appropriate term for a positive control? To me "blank" implies no sample material.

L 406-409. Does this mean only one bottle each was used for the negative and one for the positive control? Clarify. If so this is quite different from other recommendations and problematic. In particular this provides no means to estimate precision of the endogenous production. This issue requires some discussion.
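
The consequence of a single blank bottle can be sketched as follows. The example subtracts mean endogenous CH4 production from mean sample production and combines the two bottle-to-bottle standard deviations in quadrature, which is one possible convention and not necessarily the one the authors or any guideline use. It also assumes that blank and sample bottles contain the same amount of inoculum. All names and numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def net_bmp(sample_ch4, blank_ch4, vs_mass):
    """Blank-corrected BMP and its standard deviation.

    sample_ch4, blank_ch4: CH4 volumes per bottle (same units).
    vs_mass: substrate volatile solids added per sample bottle.
    Returns (bmp, sd); sd is None when either set has fewer than two
    bottles, because its spread cannot be estimated.
    """
    bmp = (mean(sample_ch4) - mean(blank_ch4)) / vs_mass
    if len(sample_ch4) < 2 or len(blank_ch4) < 2:
        return bmp, None
    combined = sqrt(stdev(sample_ch4) ** 2 + stdev(blank_ch4) ** 2)
    return bmp, combined / vs_mass


# Hypothetical bottles: three samples, three blanks, 2 g VS added.
bmp, sd = net_bmp([300.0, 310.0, 320.0], [50.0, 52.0, 54.0], 2.0)
# Same samples with a single blank: the blank contribution is unknown.
bmp_one_blank, sd_one_blank = net_bmp([300.0, 310.0, 320.0], [52.0], 2.0)
```

With only one blank, the BMP itself can still be computed, but no honest uncertainty can be attached to the endogenous term.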

L 414. Why use "CV" here when "RSD" is used elsewhere? Also applies to lines 434 and 436.

L 416-419. Considering that positive control results were discarded, does it make sense to keep this criterion?

L 423. excess -> exceed.

L 422-423. As I understand what you've written you mean the time at which the BMP itself increases by less than 1% in a day. Consider whether this is written clearly enough, especially because other, slightly different, criteria are in use.
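
The reading of the criterion given above can be written out as a short sketch: find the first day on which the daily increase in cumulative CH4 is less than 1% of the cumulative value on that day. This is only one interpretation. Variants in use elsewhere, for example requiring several consecutive days below the threshold, would need a different rule. The function name and the example series are hypothetical.

```python
def first_day_below_rate(cumulative, threshold=0.01):
    """Return the first day index whose relative daily increase is below
    `threshold`, or None if no such day exists.

    cumulative: cumulative net CH4 values at daily intervals, starting
    at day 0. Days with non-positive cumulative values are skipped.
    """
    for day in range(1, len(cumulative)):
        current = cumulative[day]
        if current <= 0:
            continue
        daily_increase = current - cumulative[day - 1]
        if daily_increase / current < threshold:
            return day
    return None


# Hypothetical cumulative net CH4 production, days 0 to 6.
series = [0.0, 100.0, 180.0, 230.0, 250.0, 252.0, 253.0]
stop_day = first_day_below_rate(series)
```

Stating explicitly whether the 1% refers to the cumulative value of the substrate bottles, to net production after blank subtraction, or to a multi-day window would remove the ambiguity.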

L 414-423. Were these criteria evaluated by each participating lab or after submission by the organizers?

L 424. How many tests were discarded by participating labs?

L 425-431. This text is a duplicate of text around line 348. Can you reorganize the text so you do not need to repeat it? As mentioned above, I think information on how outliers were identified and handled belongs in the methods.

L 439. Vague. Give more details.

L 465-466. Unclear what this means. By "will be" do you mean in future work or below?

L 468-477. This discussion on suspected moisture absorption by sodium acetate, and the resulting poor quality of all positive control results, is not very convincing to me. Were results more variable than for the complex substrates? Was the acetate stored in open containers? Would moisture also be a problem for the complex substrates? Anyway, if you did not use the sodium acetate results at all, perhaps just mention the substrate (and the problems) earlier on and note that the data were not used and will not be presented. Then avoid repeating it.

L 478-482. Microcrystalline cellulose is a popular positive control and has been recommended. It would make sense to discuss it here.

L 484-494. Show these results. In the least, mean differences (at least one is given for the AMPTS result) and p values would be helpful. Plots and tables might be useful as well.

Discussion. Much of this repeats information given elsewhere. In places it seems more like an introduction, and an extended abstract elsewhere. Consider removing the repeated parts and much of the review. Here is what Water gives in the guide for authors:

"Discussion: Authors should discuss the results and how they can be interpreted in perspective of previous studies and of the working hypotheses. The findings and their implications should be discussed in the broadest context possible and limitations of the work highlighted. Future research directions may also be mentioned. This section may be combined with Results."

In your discussion text, connections to the introduction and objectives could be more clear. For example, you stress earlier in the paper that this study uniquely measured intra-laboratory reproducibility but what is, ultimately, the significance of the estimates? A discussion of limitations and what is needed in future work would be useful also. I recommend discussing issues related to positive controls and lack of replication of the blanks (if I have understood your description correctly). Additionally, the limitations of protocol standardization/harmonization alone and a possible need for effective validation criteria seems relevant here. Please revise. Also consider whether a combined results and discussion section would be more effective.

L 552-559. This seems more like a typical discussion. But I don't think it is true that the microbial consortium (i.e., the inoculum) is the only part of the tests that differ among labs. There could be bias in the various measurement methods used, or the particular way each laboratory applies them. Some assessment of overall measurement bias for laboratories (e.g., the bivariate plots mentioned above would more clearly show whether labs tend to be high/low for all substrates or whether their response varies among substrates) could be helpful here. Some more discussion on studies that assessed inoculum effects on BMP could be useful here. Consider these papers: https://doi.org/10.1016/j.biortech.2017.06.142, https://doi.org/10.1111/1751-7915.12268, https://doi.org/10.3390/app10072589, https://doi.org/10.1016/j.biortech.2013.07.051, https://doi.org/10.1016/j.biortech.2012.01.025, https://doi.org/10.3390/w12061752 .

L 560-563. Do you think this drying conclusion is a general one, that would apply to other substrates? Has it been assessed in other studies, with other substrates?

Author Response

Reviewers' comments:

Thank you for the thorough review of our manuscript. We would like to thank all the reviewers for their detailed comments, which significantly contributed to improving the manuscript. All the reviewers' comments were taken into account and the manuscript was corrected and adjusted accordingly.

Review report #1

Comments and Suggestions for Authors

This manuscript describes an interesting interlaboratory study that adds to the limited available information on precision of BMP measurement. But the manuscript has several significant problems that need to be addressed.

1. Organization. What appear to be methods are described in the results section. Some sentences are repeated exactly.
2. Presentation of results. A single type of plot is used repeatedly, along with some simple tables. Some responses are discussed but the underlying data are not shown.
3. Discussion. It largely reads like an introduction, and does not include implications or limitations of the work.

These points are addressed in more detail below, where you can find other comments as well. Note that I have used "a -> b" to mean "change a to b" in some places.

Dear reviewer,

We would like to thank you for your comments and questions. We have considered all of them in order to present well-organized results and discussion sections. Some responses to your questions have been added to the manuscript.

Title. Biological -> Biochemical (as in your text). Otherwise it is OK as is, although it is not very descriptive and "Results from a" does not add much. You might consider adding the terms "measurement" and "precision" somehow.

Authors agree with this comment. Considering the reviewer's valuable suggestion, the authors propose this new title: “Measurement of Biochemical Methane Potential of Heterogeneous Solid Substrates: Results of a Two Phases French Inter-laboratory Study”.

Abstract. Missing a conclusion or some connection between the background/motivation and your results. For example, it does not include the important result that a harmonized protocol did not improve precision.

Thanks for this comment. The abstract has been modified.

Throughout. Minor wording problems. Please closely check language. See examples:

Thanks for these comments. The changes have been made in the revised version of the manuscript.
L 54. was -> is. Done.
L 68. "we" -- is this meant to be a personal observation? Done.
L 108 problematic -> problem (or question or topic). Done.
L 135 "twenty years (twenty past years)". Done.
L 382 few -> little. Done.
L 420 excess -> exceed. Done.
L 475 delete one "monohydrate". Done.

L 106, 149 and perhaps elsewhere. Why is "inoculum" in italics?

Thanks for this comment. The manuscript was carefully checked and “inoculum” is now written correctly throughout the revised version.

L 108-128. Is Mottet et al. (ref. 37) the correct citation? This paper seems to be on a different topic and I do not see the results you describe. Also, assuming the citation can be corrected, I wonder if all this detail is necessary, especially if it is given in the paper you will (ultimately) cite. Could it be written more concisely? Additionally, a recent paper covers this issue in some detail, and could be used here: Raposo et al. https://doi.org/10.1016/j.rser.2020.109890. Actually you cite it as ref. 13 but not in this part of the introduction. Please take a closer look and see if it is relevant for this part. And consider that the focus of this section is primarily on consistency or completeness reporting of BMP test methods--perhaps information on which test components influence BMP would be more relevant.

Authors agree with this comment and apologize for this mistake. The reference [37] Mottet et al., 2010 cited here is indeed not the correct citation for this part. This reference corresponds to the PhD thesis of Alexis Mottet, whose French title is: “Alexis Mottet. Recherche d’indicateurs de biodégradabilité anaérobie et modélisation de la digestion anaérobie thermophile: Application aux boues secondaires d’épuration non traitées et prétraitées thermiquement. Sciences du Vivant [q-bio]. Université Montpellier 2 (Sciences et Techniques), 2009. Français.” This part was partially deleted and rearranged to include Ref 14, Raposo et al., 2020.

L 122. "The quality of the measurement was evaluated with the use of blank. . . controls" Is this correct? It is my understanding that blanks are used simply to determine endogenous CH4 production. Also the "and/or" suggests that blanks and positive controls could be interchangeable, which is not the case.

Authors agree with this comment. The blanks with inoculum are indeed used only to determine the endogenous methane production, and positive controls are used only to determine inoculum activity and to validate the test. Following the previous modification, this part has now been removed from the revised version.

Introduction. In your review of interlaboratory studies you are missing another recent study here: Hafner et al. doi:10.3390/w12061752. Actually it is cited in the discussion as reference 54. Please take a closer look at the paper and see if there is information relevant to your introduction.

Authors agree with this valuable comment. This study is now also considered in the introduction section.

L 196-204. These objectives are not completely consistent with the abstract, which suggests the objective was to test a harmonized protocol.

Authors agree with this comment. The abstract has been completely revised as suggested by the reviewer.

L 250-252. Not clear enough. Were TS and VS measured by all labs? If so is this reproducibility you describe inter-laboratory? If so, shouldn't it be in the results section?

Authors agree with this valuable comment. This point is now included in the Results section and is hopefully clearer. TS and VS were indeed measured by all the labs at the start of the experiments in both phases. The TS and VS values given in this part came from the literature and were provided for information by the lab that prepared the substrates sent to all participants. The TS and VS measurements carried out by all participants are now given in the Results section in a dedicated table, which also includes the theoretical BMP values given as theoretical targets. This table gives the reproducibility of the TS and VS measurements for each substrate.

Table 1. Did you determine elemental composition and if so could you provide calculated maximum theoretical BMP? This would be interesting to have when evaluating the measured BMP values. Even without elemental composition, you could make an estimate based on nutritional information (perhaps including some reasonable guesses) as described in your ref. 53 https://doi.org/10.3390/w12051223, although the uncertainty may be too high.

Thanks for this comment. No elemental composition was determined, but theoretical BMP values were calculated as described in Appendix A. They are given in Tables A1 to A4 and plotted in Figures 4a, 4b, 4c and 7a, 7b, 7c, which show the intra-laboratory reproducibility RSD.

L 260-261. Correct described order, first dried then ground. Note grinded -> ground. Also, do you have reference that shows no loss of volatiles at 80C?

Authors agree with this comment and apologize for this mistake. The order of the preparation steps has been corrected. The loss of dry matter and volatile/organic compounds was described in some older studies on improving methods for determining dry matter in silage for feed. For example, Minson and Lancaster (1963) showed that oven drying at 100°C led to dry matter losses of up to 16 per cent, depending on the quantity of organic acids present. At lower oven drying temperatures (40°C, 70°C), smaller losses occurred.

Similar studies reached the same conclusion:

Fenner, H., Barnes, H.D. Improved method for determining dry matter in silage. J. Dairy Sci. 1965, 8(10):1324–1328.

McDonald, P., Dewarw, A. Determination of dry matter and volatiles in silages. J. Sci. Fd Agric. 1960, 11, 566-57.

Minson, D.J., Lancaster, R.J. The effect of oven temperature on the error in estimating the dry matter content of silage. New Zealand Journal of Agricultural Research 1963, 6:1-2, 140-146.

L 266 and elsewhere. Consider "bucket" -> "bottle", depending on what was actually used.

Authors disagree with this proposition. For us, bucket is the right term, mayonnaise was purchased from food cash-discounter in buckets and straw was packed in closed buckets.

L 270-271. Can you provide a reference for the mineral solution? What was it based on?

The mineral solution was prepared by one lab and sent to all participants. It was prepared according to the recommendations of Angelidaki et al., 2004 and Angelidaki et al., 2009, which were themselves adapted from Madigan et al., 2000. This last reference, “Madigan MT, Martinko JM & Parker J (2000) Brock Biology of Microorganisms, 9th edn. Prentice Hall, NY”, is now cited in the text and added to the list of references.

 

L 276-278. Was this done only by one laboratory? Clarify.

The information about TS and VS was removed from this section, and only the composition is now given here in Table 1. The TS and VS determination was carried out by all the participants. These analyses were carried out in triplicate in each lab for all substrates, which made it possible to provide reproducibility values for all substrates; these data are now given in Table 3 in the Results section.

Section 2.4. Did the individual labs calculate BMP for each individual replicate and then submit those data? Did you provide any guidelines on how the calculation should be carried out or was it up to the labs (and assumed to be simple enough that information was not needed or does not need to be described)?

Authors agree with this comment. All details are now provided in the revised version. Data were collected with Excel files sent to each participant for both phases.

L 299-305. This discussion on limitations of kinetic information from BMP tests seems out of place. Move to introduction if you keep it. Consider whether it adds to the manuscript. Also, this topic is discussed in other work including a recent paper by Koch et al. https://doi.org/10.3389/fenrg.2020.00063.

Authors agree with this valuable comment. This discussion has been removed from the revised version of the manuscript.

Figure 2. I don't think this is necessary. It really only shows that the second BMP test was started 4 weeks after the first. It is not clear what you mean by BMP1, BMP2, etc. Individual bottles/replicates? How about blanks and positive control samples?

Authors disagree with this comment and wish to keep Figure 2, but it was modified according to the reviewer's comments concerning replicates, blanks (inoculum) and positive control samples.

Figure 3. Consider moving to results. Could you be more specific about manual methods, e.g., were they all manometric?

Authors agree with this comment. Figure 3 was moved to the Results section. Manual methods were not all manometric; some of them were volumetric (measurement of volume displacement). The numbers of automatic vs manual and manometric vs volumetric measurements are now given in Table 8, which gives the details and sample sizes of the ANOVA with 4 factors for SA’, SB and SC.

L 332-334. Was this done by each laboratory? Or did they report as-measured volume and you corrected values? If the latter, do you have reference for this calculation? There seems to be some variation in how it is carried out.

Authors agree with this comment. All details are now provided in the revised version. Data were collected with Excel files sent to each participant for both phases. All the values were corrected for temperature and pressure directly in the Excel files.

S 2.5. Consider briefly describing the calculations because they are not so clear in these ISO documents, which, furthermore, are not accessible to everyone. I believe these calculations are relatively simple, e.g., repeatability standard deviation was calculated as the substrate mean of standard deviation values calculated from 3 replicates for each lab x test combination, etc. But some more clarity could be good for readers without experience in this area or familiarity with the ISO standards. Also, some details are not clear. Is it correct that error from subtraction of endogenous production was not included in your precision estimates?

Authors agree with this comment. The calculations are now detailed in the Material and Methods section.

L 347. Can you be more specific about what you mean by "satisfactory".

Thanks for this comment. Repeatability and intra-laboratory reproducibility are now given with graphs comparing sets A and B for both phases. This point is now discussed in the revised version.

L 348-353. Treatment of outliers is an important topic, as you imply. In the least, more details on how outliers were identified is needed, along with information on the number of outliers removed in the results section (I see around line 440). This information belongs in the methods section in my opinion. You might also consider repeating calculations with the suspected outliers included. It may not be correct that these outliers would never be provided to a customer.

Authors agree with this comment. The calculations are now detailed in the Material and Methods section.

Fig. 4 and similar plots. These have some problems:
* x axis label missing (presumably 1-11 are the different labs)
Thanks for this comment, and apologies for the omission. The x-axis label has been added to the corresponding figures in the revised version.

* It is not clear what results are shown here--BMP calculated by individual bottle (assay) (i.e., n = 6) and then summarized?

* What do different parts of plot show? Presumably mean, median, quartiles, extreme values, but please describe, perhaps in the methods section.

Authors agree with this comment. A description of the boxplots and their different parts has been added to the revised version, before Figure 5.

These plots show well variability within labs (e.g., by the size of each box). But if I understand correctly, they show what you referred to as repeatability and intra-laboratory reproducibility combined, which obscures information. Could you, instead, use different symbols for sets A and B, i.e., two boxes for each lab shown in different colors. They also provide some information on correlation among substrates within a lab, i.e., lab 4 shows low results for both SA, SA', but not SB. Finally, they show variability among labs (different boxes etc.). So they are useful figures but some results could be more clearly shown with different plots. These would include some statements in the results on the apparent effects (or lack of effects) of measurement method or other factors. Adding more information to the existing plots is an alternative for some results. For example, using different colors for the different measurement methods would allow readers to compare them. For others, new plots or tables would be helpful. For example, comparison of SA and SA' is probably best done using a scatterplot (SA BMP on one axis, SA' on another, points for each lab). Please consider how you can more clearly show the important results.

Authors agree with this comment. New figures (Fig. 4a,b,c and Fig. 7a,b,c) were added that provide the results for the intra-laboratory reproducibility, along with a scatterplot that shows the correlation between SA and SA’.

L 354-357 and 441-445. Details on these precautions for the statistical analysis are needed.

Thanks for this comment. Details are now included in the Material and Methods section.

L 356. workforce -> sample size?

Thanks for this comment. This word has been changed in revised version.

L 360-362. "For other factors like the assessment method. . . " No ANOVA or other results are shown to support this statement. If you think this is a significant result, please support it by showing at least mean values and p values from hypothesis tests or similar information (i.e., standard error estimate for comparison), possibly along with a figure or table. As part of this information, please include the sample sizes for the different factor levels (so far this is only given for measurement type in Fig. 2). If you do not believe these relationships and others (see related comments below) are meaningful (after all, the sample size is quite small (n = 3 for AMPTS), there is a repeated/hierarchical/nested structure as you pointed out in L 354-357, and factor levels were not randomly assigned), then why mention them at all? In that case simply state that there was no attempt to assess the impact of test factors on BMP and briefly explain why.

Authors agree with this comment. All the sample sizes are now provided in the revised version, in dedicated tables listing the expected, removed and considered values for the statistical curation. The reasons for caution in the statistical quantification of significance have also been added to the Results section of the revised version.

“Results for the statistical quantitation of significativity for the different factors obtained by ANOVA should be interpreted with caution. The measurements were taken from actual laboratory practices, therefore formed an incomplete and unbalanced experiment design, with no randomly assignment for the factor levels. The sample size for some factor levels, especially for the experimental system (Manual/AMPTS), was for some factors weak. Thereby, a certain number of precautions were applied during the statistical analysis in order to particularly dismiss the terms with too small a workforce, and to consider the nested nature of certain factors (method / gas measurement / agitation).”

please include the sample sizes for the different factor levels

New tables (Table 4 and Table 6) have been added with the details of the sample sizes, as requested.

L 362-363. If you conclude that there was no detectable effect for the effect of freezing etc. (SA vs. SA'), then support this with results from your statistical analysis. Include the mean difference and a p value at least. And did you use a paired test here, e.g., include "lab" as a factor? The design is clearly paired/"repeated measures" so you should take advantage of the higher power this provides.

Authors agree with this valuable comment. This part was modified in the revised version. Results of ANOVA with 2 factors and interaction are now given and show no difference between SA and SA’, considering the p-value of 0.563 obtained for the factor Sample.

As we understand the question, no paired tests were carried out in our study.
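For readers unfamiliar with the paired ("repeated measures") comparison the reviewer has in mind, a minimal sketch is given here. It is an illustration only, not the analysis carried out in the study: the function name `paired_t_statistic` and the per-lab BMP values are hypothetical, and each pair is assumed to be the mean SA and SA' result from one laboratory.

```python
import math
import statistics


def paired_t_statistic(x, y):
    """Return (mean difference, t statistic, degrees of freedom) for paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    # Differencing within each pair (lab) removes the between-lab offset.
    diffs = [a - b for a, b in zip(x, y)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return mean_diff, t, len(diffs) - 1


# Hypothetical per-lab mean BMP values, NOT data from the study.
sa = [300.0, 320.0, 310.0, 290.0]
sa_prime = [298.0, 316.0, 309.0, 285.0]
mean_diff, t, df = paired_t_statistic(sa, sa_prime)
print(f"mean difference = {mean_diff:.1f}, t = {t:.3f}, df = {df}")
```

A p value would then be obtained by comparing `t` against Student's t distribution with `df` degrees of freedom, for example with `scipy.stats.ttest_rel` where SciPy is available.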

L 375. samples -> substrates?

The change has been made.

L 381-388. Show more details to support these results. See comment above for L 360-362. Additionally, some information is repeated here (SA vs. SA').

Please refer to our answer above to the comment on L 360-362. The repetitions have been deleted.

Section 3.2. This material seems to be part of the methods, and I think it should be moved to that section.

One part could be moved to Methods, but not the other, because these considerations were decided on following discussion of the results obtained in the first phase.

L 401-402. Why capitalize "Mineral" etc.?

Thanks for this comment. The modification has been carried out.

L 403. Why capitalize "Carbonate"?

Thanks for this comment. The modification has been carried out.

L 404-405. Clarify, so I:S could be 2 or 5 for some substrates, i.e., some labs used 2 and some used 5? Which substrate(s)?

Thanks for this comment. All labs used an I/S ratio of 2. This point has been clarified and the sentence modified in the revised version. The proposal to use a ratio of 5 applies only to specific substrates (none in this study).

Is "blank" an appropriate term for a positive control? To me "blank" implies no sample material.

Authors agree with this valuable comment and with the fact that “blank” implies no sample material. This has been clarified and corrected in the revised version of the manuscript. “Blank” was indeed not the appropriate term for a positive control; “positive control” is the appropriate term.

L 406-409. Does this mean only one bottle each was used for the negative and one for the positive control? Clarify. If so this is quite different from other recommendations and problematic. In particular this provides no means to estimate precision of the endogenous production. This issue requires some discussion.

Blanks and positive control substrates were run in triplicate. This point is now clarified in the revised version.

L 414. Why use "CV" here when "RSD" is used elsewhere? Also applies to lines 434 and 436.

Authors agree with this comment and apologize for this mistake. CV is replaced by RSD in this part in the revised version.
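As background to this terminology point: CV (coefficient of variation) and RSD (relative standard deviation) denote the same quantity, the sample standard deviation divided by the mean, usually expressed as a percentage. A minimal sketch follows; the function name `rsd_percent` and the triplicate values are hypothetical and are not taken from the study.

```python
import statistics


def rsd_percent(values):
    """Relative standard deviation (identical to the CV) in percent."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("RSD is undefined for a zero mean")
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical triplicate BMP results, NOT data from the study.
triplicate = [290.0, 300.0, 310.0]
print(f"RSD = {rsd_percent(triplicate):.2f} %")
```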

L 416-419. Considering that positive control results were discarded, does it make sense to keep this criterion?

Authors agree with this comment. The text is now modified.

L 423. excess -> exceed.

Thanks for this comment. The modification has been carried out.

L 422-423. As I understand what you've written you mean the time at which the BMP itself increases by less than 1% in a day. Consider whether this is written clearly enough, especially because other, slightly different, criteria are in use.

Authors agree with this comment. The text has been modified accordingly.
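One possible reading of the termination criterion discussed in this comment can be sketched as follows. This is an assumption for illustration, not the rule as finally worded in the manuscript: the daily increase is taken relative to the current cumulative value, readings are assumed to be one per day, and the function name and numbers are hypothetical. Other criteria in use (for example, requiring several consecutive days below the threshold) would give different stopping days.

```python
def first_day_below_rate(cumulative, threshold=0.01):
    """Return the index of the first daily reading whose increase over the
    previous day is below `threshold` times the current cumulative value,
    or None if the criterion is never met."""
    for day in range(1, len(cumulative)):
        current = cumulative[day]
        if current <= 0:
            continue  # no production yet, a relative rate is not defined
        if (current - cumulative[day - 1]) / current < threshold:
            return day
    return None


# Hypothetical cumulative methane production, one reading per day.
readings = [0.0, 100.0, 180.0, 230.0, 250.0, 252.0, 253.0]
stop_day = first_day_below_rate(readings)
print(stop_day)
```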

L 414-423. Were these criteria evaluated by each participating lab or after submission by the organizers?

Thanks for this comment. This point is now clearly explained in Material and Methods section.

L 424. How many tests were discarded by participating labs?

Thanks for this comment. The numbers of expected, discarded and considered values for the participating labs are now given in Tables 4 and 6.

L 425-431. This text is a duplicate of text around line 348. Can you reorganize the text so you do not need to repeat it? As mentioned above, I think information on how outliers were identified and handled belongs in the methods.

Authors agree with this comment. The text is now reorganized as requested by the reviewer.

L 439. Vague. Give more details.

Authors agree with this comment. The text is now clarified as requested by the reviewer.

L 465-466. Unclear what this means. By "will be" do you mean in future work or below?

Authors agree with this comment. The effect of the inoculum is now better discussed in the revised manuscript.

L 468-477. This discussion on suspected moisture absorption by sodium acetate, and the resulting poor quality of all positive control results, is not very convincing to me. Were results more variable than for the complex substrates? Was the acetate stored in open containers? Would moisture also be a problem for the complex substrates? Anyway, if you did not use the sodium acetate results at all, perhaps just mention the substrate (and the problems) earlier on and note that the data were not used and will not be presented. Then avoid repeating it.

Authors agree with this comment. Results were not more variable than for the complex substrates. Monohydrate sodium acetate is more hygroscopic than the dried and shredded substrate SA’. The acetate was purchased from one commercial lot, aliquoted into small boxes without desiccant, sent to all participants, and finally stored at room temperature. The hydration of sodium acetate was only a hypothesis. As suggested, we removed it from the text.

L 478-482. Microcrystalline cellulose is a popular positive control and has been recommended. It would make sense to discuss it here.

Authors agree with this comment. This point is now discussed in the revised version.

L 484-494. Show these results. In the least, mean differences (at least one is given for the AMPTS result) and p values would be helpful. Plots and tables might be useful as well.

Authors agree with this comment. The results of the ANOVA with 4 factors carried out for the 3 substrates are now included in this part, showing p-values < 0.005 for the variable Method (automatic vs manual); a discussion of the significance of such results, obtained from a small sample size, has also been added.

Discussion. Much of this repeats information given elsewhere. In places it seems more like an introduction, and an extended abstract elsewhere. Consider removing the repeated parts and much of the review. Here is what Water gives in the guide for authors:

"Discussion: Authors should discuss the results and how they can be interpreted in perspective of previous studies and of the working hypotheses. The findings and their implications should be discussed in the broadest context possible and limitations of the work highlighted. Future research directions may also be mentioned. This section may be combined with Results."

Authors agree with this valuable comment. Discussion is now merged with Results as recommended by the reviewer.

In your discussion text, connections to the introduction and objectives could be more clear. For example, you stress earlier in the paper that this study uniquely measured intra-laboratory reproducibility but what is, ultimately, the significance of the estimates? A discussion of limitations and what is needed in future work would be useful also. I recommend discussing issues related to positive controls and lack of replication of the blanks (if I have understood your description correctly). Additionally, the limitations of protocol standardization/harmonization alone and a possible need for effective validation criteria seems relevant here. Please revise. Also consider whether a combined results and discussion section would be more effective.

Authors agree with this comment. Results and Discussion are now merged in the revised version, hoping that the understanding is improved for the reader.

L 552-559. This seems more like a typical discussion. But I don't think it is true that the microbial consortium (i.e., the inoculum) is the only part of the tests that differ among labs. There could be bias in the various measurement methods used, or the particular way each laboratory applies them. Some assessment of overall measurement bias for laboratories (e.g., the bivariate plots mentioned above would more clearly show whether labs tend to be high/low for all substrates or whether their response varies among substrates) could be helpful here. Some more discussion on studies that assessed inoculum effects on BMP could be useful here. Consider these papers: https://doi.org/10.1016/j.biortech.2017.06.142, https://doi.org/10.1111/1751-7915.12268, https://doi.org/10.3390/app10072589, https://doi.org/10.1016/j.biortech.2013.07.051, https://doi.org/10.1016/j.biortech.2012.01.025, https://doi.org/10.3390/w12061752 .

Authors agree with this comment. The text was modified in the revised version and some references proposed by the reviewer and some others were added.

L 560-563. Do you think this drying conclusion is a general one, that would apply to other substrates? Has it been assessed in other studies, with other substrates?

We do not know whether this would apply to other substrates. Probably yes, except for substrates that contain volatile compounds such as VFA.

 

 

Review report #2

Comments and Suggestions for Authors

The study presented is of high relevance for the research community since the standardisation/harmonisation of BMP tests requires a large contribution of different laboratories that use this method.

There are several aspects that can/should be improved:

  • The introduction is in parts formulated in great detail and might be shortened. A table summarizing the most important aspects of the different studies mentioned could help the reader get an overview of the different initiatives on BMP standardisation and their main conclusions. The latest large interlaboratory study (as mentioned in reference 54) should already be described in the introduction.

Authors agree with this comment. The introduction was shortened in places. As for the requested table summarizing the most important aspects of the different studies mentioned, such tables have already been published in recent papers (Ohemeng-Ntiamoah and Datta, 2019; Filer et al., 2019; Raposo et al., 2020; Hafner et al., 2020), and we do not think it useful to reproduce them.

  • The language has to be revised (e.g. use of colloquial language as "...reproducibility calculated were unpleasantly of the..." l. 456, "...the criteria were bitterly and for a long time debated..." l. 399. Also some odd formulations are used: e.g. l.317 "...were asked to satisfy the following rules...")

Authors agree with this comment. Language is now revised.

  • The first six paragraphs of the discussion (l.496-l.551) are rather a summary of the results than a discussion. The own results should be set into context with other initiatives as mentioned in the introduction.

Authors agree with this comment. The text has been modified and shortened as requested in the revised version.

  • In l. 554, the nature of the microbial consortium is mentioned as a reason for the inter-laboratory variability:

In [54], an international study with a large data set and inoculum exchange, an impact of the inoculum was not observed; however, errors in data processing are mentioned as a reason for variability, which has not been discussed in this manuscript.

Authors agree with this comment. This is written in the revised version. The data processing is now detailed in Material and Methods section and we do not think that the observed variability can be explained by the curation of the data results.

  • A summary/conclusion section is missing.

The Discussion section has been merged with the Results section for better understanding, as suggested by the reviewers. A conclusion section has been added in the revised version.

Some further minor comments:

  • l. 135: ...past twenty years (twenty past years)... It was modified in the revised version.
  • l. 149-152: The sentence is unclear. It was modified in the revised version.
  • l. 297: "The test was therefore generally carried out in batch mode..." Isn't the BMP always determined in batch mode?

Authors fully agree with this comment. The corresponding paragraph is now removed in the revised version.

  • l. 356: "...the statistical analysis in order to particularly dismiss the terms with too small a workforce..." What is meant by that?

Authors agree with this comment and apologize for the confusion. The sentence is now modified in the revised version.

l.381-388: The content of the paragraph was mentioned just before (l.358-363).

Authors apologize for this mistake. It was now modified in the revised version.

  • l. 408: endogenous activity. What is meant by this in this context and how does it differ from the only inoculum blank that is needed for every BMP test in order to determine the gas production by the inoculum?

Authors agree with this comment. It has been clarified throughout the revised text. The only expressions now used are “blank” for the endogenous activity of the inoculum and “positive control substrate” for the determination of the methanogenic activity of the inoculum using a standard (cellulose, for example).

 

Reviewer 2 Report

The study presented is of high relevance for the research community since the standardisation/harmonisation of BMP tests requires a large contribution of different laboratories that use this method.

There are several aspects that can/should be improved:

  • The introduction is in parts formulated in great detail and might be shortened. A table summarizing the most important aspects of the different studies mentioned could help the reader get an overview of the different initiatives on BMP standardisation and their main conclusions. The latest large interlaboratory study (as mentioned in reference 54) should already be described in the introduction.
  • The language has to be revised (e.g. use of colloquial language as "...reproducibility calculated were unpleasantly of the..." l. 456, "...the criteria were bitterly and for a long time debated..." l. 399. Also some odd formulations are used: e.g. l.317 "...were asked to satisfy the following rules...")
  • The first six paragraphs of the discussion (l.496-l.551) are rather a summary of the results than a discussion. The own results should be set into context with other initiatives as mentioned in the introduction.

  • In l. 554, the nature of the microbial consortium is mentioned as a reason for the inter-laboratory variability:

    In [54], an international study with a large data set and inoculum exchange, an impact of the inoculum was not observed; however, errors in data processing are mentioned as a reason for variability, which has not been discussed in this manuscript.

  • A summary/conclusion section is missing.

Some further minor comments:

  • l. 135: ...past twenty years (twenty past years)...
  • l. 149-152: The sentence is unclear.
  • l. 297: "The test was therefore generally carried out in batch mode..." Isn't the BMP always determined in batch mode?
  • l. 356: "...the statistical analysis in order to particularly dismiss the terms with too small a workforce..." What is meant by that?
  • l.381-388: The content of the paragraph was mentioned just before (l.358-363).
  • l. 408: endogenous activity. What is meant by this in this context and how does it differ from the only inoculum blank that is needed for every BMP test in order to determine the gas production by the inoculum?

Author Response

Reviewers' comments:

Thank you for the thorough review of our manuscript. We would like to thank all the reviewers for their detailed comments, which contributed significantly to improving the manuscript. All the reviewers' comments were taken into account and the manuscript was corrected and adjusted accordingly.

 

Round 2

Reviewer 1 Report

R2: All comments from me (reviewer 1) for the second round are preceded by "R2:", so you can find all my responses by searching for R2 in your text editor. Most of my comments are in response to your explanations/answers, but there are also some at the bottom that relate to new topics. You have substantially improved the manuscript! Thank you for considering all my suggestions.

This manuscript describes an interesting interlaboratory study that adds to the limited available information on precision of BMP measurement. But the manuscript has several significant problems that need to be addressed.

1. Organization. What appear to be methods are described in the results section. Some sentences are repeated exactly.
2. Presentation of results. A single type of plot is used repeatedly, along with some simple tables. Some responses are discussed but the underlying data are not shown.
3. Discussion. It largely reads like an introduction, and does not include implications or limitations of the work.

These points are addressed in more detail below, where you can find other comments as well. Note that I have used "a -> b" to mean "change a to b" in some places.

Dear reviewer,

We would like to thank you for your comments and questions. We have considered all of them in order to present well-organized results and discussion sections. Some responses to your questions have been added to the manuscript.

Title. Biological -> Biochemical (as in your text). Otherwise it is OK as is, although it is not very descriptive and "Results from a" does not add much. You might consider adding the terms "measurement" and "precision" somehow.

Authors agree with this comment. The authors propose this new title, “Measurement of Biochemical Methane Potential of Heterogeneous Solid Substrates: Results of a Two Phases French Inter-laboratory Study”, taking into account the reviewer's valuable suggestion.

R2: Good except you need to change "two phases" to "two-phase" (as adjective) or somehow use ". . .study with two phases".

Abstract. Missing a conclusion or some connection between the background/motivation and your results. For example, it does not include the important result that a harmonized protocol did not improve precision.

Thanks for this comment. This abstract was modified.

R2: Good, but double-check language, e.g. "how various substrates to produce methane".

Throughout. Minor wording problems. Please closely check language. See examples:

Thanks for these comments. The changes were done in the revised version of the manuscript.
L 54. was -> is. Done.
L 68. "we" -- is this meant to be a personal observation? Done.
L 108. problematic -> problem (or question or topic). Done.
L 135. "twenty years (twenty past years)". Done.
L 382. few -> little. Done.
L 420. excess -> exceed. Done.
L 475. delete one "monohydrate". Done.

R2: OK. But at least some of these were in fact not changed. See line 68 again.

L 106, 149 and perhaps elsewhere. Why is "inoculum" in italics?

Thanks for this comment. The manuscript was carefully checked and “inoculum” is now correctly written in the whole revised version.

R2: OK.

L 108-128. Is Mottet et al. (ref. 37) the correct citation? This paper seems to be on a different topic and I do not see the results you describe. Also, assuming the citation can be corrected, I wonder if all this detail is necessary, especially if it is given in the paper you will (ultimately) cite. Could it be written more concisely? Additionally, a recent paper covers this issue in some detail, and could be used here: Raposo et al. https://doi.org/10.1016/j.rser.2020.109890. Actually you cite it as ref. 13 but not in this part of the introduction. Please take a closer look and see if it is relevant for this part. And consider that the focus of this section is primarily on consistency or completeness of reporting of BMP test methods--perhaps information on which test components influence BMP would be more relevant.

Authors agree with this comment and apologize for this mistake. The reference [37] Mottet et al., 2010 cited here is indeed not the correct citation for this part. This reference corresponds to the PhD thesis of Alexis Mottet, entitled in French: “Alexis Mottet. Recherche d’indicateurs de biodégradabilité anaérobie et modélisation de la digestion anaérobie thermophile: Application aux boues secondaires d’épuration non traitées et prétraitées thermiquement. Sciences du Vivant [q-bio]. Université Montpellier 2 (Sciences et Techniques), 2009. Français.” This part was partially deleted and rearranged, and now includes Ref. 14 (Raposo et al., 2020).

R2: Good.

L 122. "The quality of the measurement was evaluated with the use of blank. . . controls" Is this correct? It is my understanding that blanks are used simply to determine endogenous CH4 production. Also the "and/or" suggests that blanks and positive controls could be interchangeable, which is not the case.

Authors agree with this comment. The blanks with inoculum are indeed used only to determine the endogenous methane production, and positive controls are used only to determine inoculum activity and to validate the test. In line with the previous modification, this part has been removed from the revised version.

R2: OK.

Introduction. In your review of interlaboratory studies you are missing another recent study here: Hafner et al. doi:10.3390/w12061752. Actually it is cited in the discussion as reference 54. Please take a closer look at the paper and see if there is information relevant to your introduction.

Authors agree with this valuable comment. This study was now also considered in the introduction section.

R2: OK.

L 196-204. These objectives are not completely consistent with the abstract, which suggests the objective was to test a harmonized protocol.

Authors agree with this comment. Abstract is completely modified as suggested by the reviewer.

R2: The abstract is indeed more clear and complete, but the sole objective listed in the abstract is not the same as those listed on lines 188+. Perhaps you could just add "harmonized BMP protocol was *developed* and tested" and that would align them well enough.

L 250-252. Not clear enough. Were TS and VS measured by all labs? If so is this reproducibility you describe inter-laboratory? If so, shouldn't it be in the results section?

Authors agree with this valuable comment. This point is now included in results section and hopefully more clear. The TS and VS were effectively measured by all the labs when starting the experiments during the two phases. The TS and VS given in this part were from literature and were provided as information by the lab preparing the substrates sent to all participants. The TS and VS measurements carried out by all participants were given now in the Results section with a dedicated Table, including the theoretical BMP values given as theoretical targets. The reproducibility was given in this Table for each substrate and for TS and VS measurements.

R2: OK.


Table 1. Did you determine elemental composition and if so could you provide calculated maximum theoretical BMP? This would be interesting to have when evaluating the measured BMP values. Even without elemental composition, you could make an estimate based on nutritional information (perhaps including some reasonable guesses) as described in your ref. 53 https://doi.org/10.3390/w12051223, although the uncertainty may be too high.

Thanks for this comment. No elemental composition was determined but theoretical BMP were calculated as described in Appendix A and given in Table A1 to A4 and plotted in the Figures 4a, 4b, 4c and 7a, 7b, 7c showing the intra-laboratory reproducibility RSD.

R2: Good--nice addition.

L 260-261. Correct described order, first dried then ground. Note grinded -> ground. Also, do you have reference that shows no loss of volatiles at 80C?

Authors agree with this comment and apologize for this mistake. The order of the preparation steps was changed. The loss of dry matter and volatile/organic compounds was described in some older studies on improving methods for determining dry matter in silage for feed. For example, Minson and Lancaster (1963) showed that oven drying at 100°C led to dry matter losses of up to 16 per cent, depending on the quantity of organic acids present. At lower oven drying temperatures (40°C, 70°C), smaller losses occurred.

Similar studies reached the same conclusion:

Fenner, H., Barnes, H.D. Improved method for determining dry matter in silage. J. Dairy Sci. 1965, 8(10):1324–1328.

McDonald, P., Dewar, W.A. Determination of dry matter and volatiles in silages. J. Sci. Fd Agric. 1960, 11, 566-57.

Minson, D.J., Lancaster, R.J. The effect of oven temperature on the error in estimating the dry matter content of silage. New Zealand Journal of Agricultural Research 1963, 6:1-2, 140-146.


R2: But do any of these show there is no loss of volatiles at 80C? If so please cite it in the paper. If not please correct wording. You could correct the wording on line 413 with "avoiding" -> "to reduce".

L 266 and elsewhere. Consider "bucket" -> "bottle", depending on what was actually used.

Authors disagree with this proposition. For us, bucket is the right term, mayonnaise was purchased from food cash-discounter in buckets and straw was packed in closed buckets.

R2: OK.

L 270-271. Can you provide a reference for the mineral solution? What was it based on?

The mineral solution was prepared by one lab and sent to all participants. It was prepared according to the recommendations of Angelidaki et al., 2004 and Angelidaki et al., 2009, themselves adapted from Madigan et al., 2000. This last reference, “Madigan MT, Martinko JM & Parker J (2000) Brock Biology of Microorganisms, 9th edn. Prentice Hall, NY”, is now cited in the text and added to the list of references.

R2: OK. If it has the same composition as Angelidaki you do not need the table.

L 276-278. Was this done only by one laboratory? Clarify.

The information about TS and VS was removed from this section, and only the composition is now given here in Table 1. The TS and VS determination was carried out by all the participants. These analyses were carried out in triplicate in each lab for all substrates, which provides reproducibility values for all substrates; these data are now given in Table 3 in the Results section.

R2: OK.

Section 2.4. Did the individual labs calculate BMP for each individual replicate and then submit those data? Did you provide any guidelines on how the calculation should be carried out or was it up the labs (and assumed to be simple enough that information was not needed or does not need to be described)?

Authors agree with this comment. All details are now provided in the revised version. Data for both steps were collected using Excel files sent to each participant.

R2: The details are helpful thank you. Can you provide a citation for the calculations? Also consider including an example file (with the formulas) as supplementary material.

L 299-305. This discussion on limitations of kinetic information from BMP tests seems out of place. Move to introduction if you keep it. Consider whether it adds to the manuscript. Also, this topic is discussed in other work including a recent paper by Koch et al. https://doi.org/10.3389/fenrg.2020.00063.

Authors agree with this valuable comment. This discussion is now removed in the revised version of the manuscript.

R2: OK.

Figure 2. I don't think this is necessary. It really only shows that the second BMP test was started 4 weeks after the first. It is not clear what you mean by BMP1, BMP2, etc. Individual bottles/replicates? How about blanks and positive control samples?

The authors disagree with this comment and wish to keep the figure, but Figure 2 was modified according to the reviewer's comments concerning replicates, blanks (inoculum) and positive control samples.

R2: OK. Perhaps you could remind readers that this process was carried out twice (phases 1 and 2) in the caption.

Figure 3. Consider moving to results. Could you be more specific about manual methods, e.g., were they all manometric?

Authors agree with this comment. Figure 3 was moved to the Results section. Manual methods were not all manometric; some were volumetric (measurement of volume displacement). The numbers of automatic vs manual and manometric vs volumetric measurements are now given in Table 8, which provides the details and sample sizes of the 4-factor ANOVA for SA’, SB and SC.

R2: OK.

L 332-334. Was this done by each laboratory? Or did they report as-measured volume and you corrected values? If the latter, do you have a reference for this calculation? There seems to be some variation in how it is carried out.

Authors agree with this comment. All details are now provided in the revised version. Data for both steps were collected using Excel files sent to each participant. All the values were corrected for temperature and pressure directly in the Excel files.

R2: This could be made more clear in the text.
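
For readers following this exchange, the snippet below is a minimal sketch of the kind of temperature and pressure correction being discussed. It assumes dry gas, ideal-gas behaviour, and reference conditions of 273.15 K and 101.325 kPa; the function name, units, and reference conditions are assumptions for illustration, and the record does not state which formula the Excel files actually used.

```python
def standardize_volume(v_ml, temp_c, pressure_kpa,
                       t_std_k=273.15, p_std_kpa=101.325):
    """Convert a measured dry gas volume to reference conditions.

    Uses the ideal gas law: V_std = V * (P / P_std) * (T_std / T).
    The reference conditions are assumed defaults, not taken from the paper.
    """
    temp_k = temp_c + 273.15  # measurement temperature in kelvin
    return v_ml * (pressure_kpa / p_std_kpa) * (t_std_k / temp_k)


# Hypothetical reading: 100 mL measured at 35 °C and standard pressure.
v_std = standardize_volume(100.0, 35.0, 101.325)
```

Whether the correction is applied by each lab or centrally matters less than applying the same formula to every reading, which is the variation the reviewer is asking about.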

S 2.5. Consider briefly describing the calculations because they are not so clear in these ISO documents, which, furthermore, are not accessible to everyone. I believe these calculations are relatively simple, e.g., repeatability standard deviation was calculated as the substrate mean of standard deviation values calculated from 3 replicates for each lab x test combination, etc. But some more clarity could be good for readers without experience in this area or familiarity with the ISO standards. Also, some details are not clear. Is it correct that error from subtraction of endogenous production was not included in your precision estimates?

Authors agree with this comment. The calculations are now detailed in the Material and Methods section.

R2: The new explanation and equations are helpful but still not quite clear, but could be addressed with some more details or perhaps a better citation. You have not really defined SSD_L. Presumably SSD_r was calculated from the difference between results from individual replicates and the mean for each particular lab. Is this correct? I don't mean to be too disagreeable on this topic but the ISO references you cite just do not describe these methods in detail. Is it possible you meant to cite ISO 5725-2 instead of ISO 5725-1? If so, perhaps these details are not needed (although my comment above still applies). Or have I overlooked the detailed descriptions in 5725-1 somehow? Or are there multiple versions of ISO 5725-1?
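
As an illustration of the calculation the reviewer describes, the snippet below sketches one common way to estimate repeatability and reproducibility standard deviations from a balanced one-way layout (labs by replicates), in the style of ISO 5725-2. It is a sketch under stated assumptions, not the authors' confirmed method: the function name and the example data are hypothetical, every lab is assumed to report the same number of replicates, and a negative between-lab variance estimate is truncated to zero.

```python
from math import sqrt
from statistics import mean, variance


def precision_estimates(results):
    """Return (s_r, s_L, s_R) for a balanced design.

    results: dict mapping lab id -> list of replicate BMP values.
    s_r: repeatability SD (pooled within-lab spread).
    s_L: between-lab SD.
    s_R: reproducibility SD, with s_R^2 = s_L^2 + s_r^2.
    """
    sizes = {len(values) for values in results.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes the same number of replicates per lab")
    n = sizes.pop()

    # Within-lab variance: replicates compared with their own lab mean.
    s_r2 = mean([variance(values) for values in results.values()])

    # Between-lab variance: variance of lab means minus the share
    # that is explained by within-lab noise, floored at zero.
    lab_means = [mean(values) for values in results.values()]
    s_l2 = max(variance(lab_means) - s_r2 / n, 0.0)

    return sqrt(s_r2), sqrt(s_l2), sqrt(s_l2 + s_r2)


# Hypothetical BMP values for two labs, three replicates each.
example = {"lab1": [10.0, 12.0, 14.0], "lab2": [20.0, 22.0, 24.0]}
s_r, s_L, s_R = precision_estimates(example)
```

Note that this sketch does not propagate the uncertainty from subtracting endogenous (blank) production, which is one of the points raised in the original comment.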

L 347. Can you be more specific about what you mean by "satisfactory".

Thanks for this comment. Repeatability and intra-laboratory reproducibility are now shown in graphs comparing sets A and B for both phases. This point is now discussed in the revised version.

R2: Good.

L 348-353. Treatment of outliers is an important topic, as you imply. At the least, more details on how outliers were identified are needed, along with information on the number of outliers removed in the results section (I see around line 440). This information belongs in the methods section in my opinion. You might also consider repeating calculations with the suspected outliers included. It may not be correct that these outliers would never be provided to a customer.

Authors agree with this comment. The calculations are now detailed in the Material and Methods section.

R2: OK.

Fig. 4 and similar plots. These have some problems:
* x axis label missing (presumably 1-11 are the different labs)
Thanks for this comment. Sorry for forgetting this. X axis label is now added in the corresponding figures in the revised version.

* It is not clear what results are shown here--BMP calculated by individual bottle (assay) (i.e., n = 6) and then summarized?

* What do different parts of plot show? Presumably mean, median, quartiles, extreme values, but please describe, perhaps in the methods section.

Authors agree with this comment. A description of boxplots and their different parts is now added in the revised version before the Figure 5.

R2: OK.

These plots show well variability within labs (e.g., by the size of each box). But if I understand correctly, they show what you referred to as repeatability and intra-laboratory reproducibility combined, which obscures information. Could you, instead, use different symbols for sets A and B, i.e., two boxes for each lab shown in different colors. They also provide some information on correlation among substrates within a lab, i.e., lab 4 shows low results for both SA, SA', but not SB. Finally, they show variability among labs (different boxes etc.). So they are useful figures but some results could be more clearly shown with different plots. These would include some statements in the results on the apparent effects (or lack of effects) of measurement method or other factors. Adding more information to the existing plots is an alternative for some results. For example, using different colors for the different measurement methods would allow readers to compare them. For others, new plots or tables would be helpful. For example, comparison of SA and SA' is probably best done using a scatterplot (SA BMP on one axis, SA' on another, points for each lab). Please consider how you can more clearly show the important results.

Authors agree with this comment. New figures (Fig 4a,b,c and Fig. 7a,b,c) were added to show the results for the intra-laboratory reproducibility, along with a scatterplot showing the correlation between SA and SA’.

R2: I like the new figures. But do you need both the bar charts and the box plots (e.g., both Figs. 4 and 5)? They seem to show the same information in slightly different ways.

L 354-357 and 441-445. Details on these precautions for the statistical analysis are needed.

Thanks for this comment. Details are now included in the Material and Methods section.

R2: OK.

L 356. workforce -> sample size?

Thanks for this comment. This word has been changed in revised version.

R2: OK.

L 360-362. "For other factors like the assessment method. . . " No ANOVA or other results are shown to support this statement. If you think this is a significant result, please support it by showing at least mean values and p values from hypothesis tests or similar information (i.e., standard error estimate for comparison), possibly along with a figure or table. As part of this information, please include the sample sizes for the different factor levels (so far this is only given for measurement type in Fig. 2). If you do not believe these relationships and others (see related comments below) are meaningful (after all, the sample size is quite small (n = 3 for AMPTS) and there is a repeated/hierarchical/nested structure as you pointed out in L 354-357; in addition, factor levels were not randomly assigned) then why mention them at all? In that case simply state that there was no attempt to assess the impact of test factors on BMP and briefly explain why.

Authors agree with this comment. All the sample sizes are now provided in the revised version, in dedicated tables giving the expected, removed and considered values for the statistical curation. The reasons for taking precautions in the statistical assessment of significance were also added to the Results section of the revised version.

R2: OK

“Results for the statistical quantitation of significativity for the different factors obtained by ANOVA should be interpreted with caution. The measurements were taken from actual laboratory practices, therefore formed an incomplete and unbalanced experiment design, with no randomly assignment for the factor levels. The sample size for some factor levels, especially for the experimental system (Manual/AMPTS), was for some factors weak. Thereby, a certain number of precautions were applied during the statistical analysis in order to particularly dismiss the terms with too small a workforce, and to consider the nested nature of certain factors (method / gas measurement / agitation).”

please include the sample sizes for the different factor levels

Some new tables (Table 4 and Table 6) are now included with the details of the sample sizes, as requested.

R2: OK.

L 362-363. If you conclude that there was no detectable effect for the effect of freezing etc. (SA vs. SA'), then support this with results from your statistical analysis. Include the mean difference and a p value at least. And did you use a paired test here, e.g., include "lab" as a factor? The design is clearly paired/"repeated measures" so you should take advantage of the higher power this provides.

Authors agree with this valuable comment. This part was modified in the revised version. Results of ANOVA with 2 factors and interaction are now given and show no difference between SA and SA’, considering the p-value of 0.563 obtained for the factor Sample.

No paired tests were carried out in our study as we understand the question.

R2: This is in lines 542+, correct? The description could be more clear. Instead of "sample", do you mean "substrate" or "treatment (dried vs. not dried)" or just "drying"? Also check the language in the paragraph--the bit on excluding the results from a single lab is particularly unclear. And it is surprising that a single pair of observations would have this effect (an increase in the p value by including the point makes more sense, but a decrease--strange). If you have included "lab" as a factor in your ANOVA, this is essentially a paired approach (equivalent to a paired t-test). But it does not make sense to have 50 values for SA and 53 for SA'--you should drop the unpaired values, right?
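
The paired approach the reviewer refers to can be sketched as follows: keep only labs that reported both SA and SA’, take the per-lab difference, and test whether its mean differs from zero. This is a minimal sketch, not the analysis used in the manuscript; the function name and the lab means are hypothetical, one mean value per lab and substrate is assumed, and the p value is left to a t table or a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_difference(sa, sa_prime):
    """Compare two substrates lab by lab.

    sa, sa_prime: dicts mapping lab id -> mean BMP for that substrate.
    Returns (paired_labs, mean_difference, t_statistic), where the
    t statistic has len(paired_labs) - 1 degrees of freedom.
    """
    # Drop unpaired observations: only labs present in both sets count.
    labs = sorted(set(sa) & set(sa_prime))
    if len(labs) < 2:
        raise ValueError("at least two paired labs are needed")

    diffs = [sa_prime[lab] - sa[lab] for lab in labs]
    d_mean = mean(diffs)
    # Standard error of the mean difference (fails if all diffs are equal).
    t_stat = d_mean / (stdev(diffs) / sqrt(len(diffs)))
    return labs, d_mean, t_stat


# Hypothetical lab means; labs 4 and 5 are unpaired and are dropped.
sa = {1: 200.0, 2: 210.0, 3: 190.0, 4: 205.0}
sa_prime = {1: 204.0, 2: 212.0, 3: 196.0, 5: 220.0}
labs, d_mean, t_stat = paired_difference(sa, sa_prime)
```

Because each lab serves as its own control, between-lab differences cancel out of the comparison, which is the gain in power the reviewer mentions.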

L 375. samples -> substrates?

The change has been made.

R2: Please check elsewhere also.

L 381-388. Show more details to support these results. See comment above for L 360-362. Additionally, some information is repeated here (SA vs. SA').

Please refer to our answer to the previous comment about L 360-362. Repetitions have now been deleted.

R2: OK.

Section 3.2. This material seems to be part of the methods, and I think it should be moved to that section.

One part could be moved to Methods but not the other, because these considerations were decided from the discussion of the results obtained in the first phase.

R2: OK.

L 401-402. Why capitalize "Mineral" etc.?

Thanks for this comment. The modification has been carried out.

R2: OK.

L 403. Why capitalize "Carbonate"?

Thanks for this comment. The modification has been carried out.

R2: OK.

L 404-405. Clarify, so I:S could be 2 or 5 for some substrates, i.e., some labs used 2 and some used 5? Which substrate(s)?

Thanks for this comment. All labs used an I/S ratio of 2. This point is now clarified and the sentence has been modified in the revised version. The suggestion of using 5 applies only to specific substrates (not in this study).

R2: OK.

L 409. Is "blank" an appropriate term for a positive control? To me "blank" implies no sample material.

Authors agree with this valuable comment and with the fact that “blank” implies no sample material. This was clarified and modified in the revised version of the manuscript. “Blank” was indeed not the appropriate term for a positive control; “positive control” is the more appropriate term.

R2: OK.

L 406-409. Does this mean only one bottle each was used for the negative and one for the positive control? Clarify. If so this is quite different from other recommendations and problematic. In particular this provides no means to estimate precision of the endogenous production. This issue requires some discussion.

Blanks and positive control substrates were run in triplicate. This point is now clarified in the revised version.

R2: OK.

L 414. Why use "CV" here when "RSD" is used elsewhere? Also applies to lines 434 and 436.

Authors agree with this comment and apologize for this mistake. CV is replaced by RSD in this part in the revised version.

R2: OK.

L 416-419. Considering that positive control results were discarded, does it make sense to keep this criterion?

Authors agree with this comment. The text is now modified.

R2: I do not see any change here. Please consider original comment.

L 423. excess -> exceed.

Thanks for this comment. The modification has been carried out.

R2: OK.

L 422-423. As I understand what you've written you mean the time at which the BMP itself increases by less than 1% in a day. Consider whether this is written clearly enough, especially because other, slightly different, criteria are in use.

Authors agree with this comment. The text was now modified according to the comment.

R2: I don't see any change in the text. Please check again. (And please be careful to avoid stating that a change was made when in fact it was not made--this erodes trust and makes reviewing difficult.)
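
To show why the wording of this criterion matters, the snippet below sketches one possible reading: the test ends on the first day on which the daily increase in cumulative BMP is less than 1% of the cumulative value on that day. This reading, the function name, and the example data are all assumptions; variants (for example, requiring the condition on several consecutive days) would give different end times.

```python
def first_day_below_threshold(cumulative_bmp, threshold=0.01):
    """Return the first day index meeting the assumed termination rule.

    cumulative_bmp: cumulative BMP values, one per day, starting at day 0.
    The rule applied here: daily gain divided by the cumulative value on
    that day is below `threshold`. Returns None if no day qualifies.
    """
    for day in range(1, len(cumulative_bmp)):
        total = cumulative_bmp[day]
        if total <= 0:
            continue  # a relative gain is undefined before any production
        gain = total - cumulative_bmp[day - 1]
        if gain / total < threshold:
            return day
    return None


# Hypothetical cumulative curve (arbitrary units), days 0 to 6.
curve = [0.0, 100.0, 180.0, 230.0, 250.0, 252.0, 253.0]
end_day = first_day_below_threshold(curve)
```

Stating the denominator and the number of days over which the rule must hold would remove the ambiguity the reviewer points out.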

L 414-423. Were these criteria evaluated by each participating lab or after submission by the organizers?

Thanks for this comment. This point is now clearly explained in Material and Methods section.

R2: OK.

L 424. How many tests were discarded by participating labs?

Thanks for this comment. The number of expected, discarded and considered values by participating labs are now clearly mentioned in Tables 4 and 6.

R2: OK. Perhaps "expected" is not the best term. This implies that all these tests were not necessarily carried out or submitted.

L 425-431. This text is a duplicate of text around line 348. Can you reorganize the text so you do not need to repeat it? As mentioned above, I think information on how outliers were identified and handled belongs in the methods.

Authors agree with this comment. The text is now reorganized as requested by the reviewer.

R2: OK.

L 439. Vague. Give more details.

Authors agree with this comment. The text is now clarified as requested by the reviewer.

R2: The text has in fact not been changed. See original comment. What do you mean by "technical justification"?

L 465-466. Unclear what this means. By "will be" do you mean in future work or below?

Authors agree with this comment. The effect of the inoculum is now better discussed in the revised manuscript.

R2: OK.

L 468-477. This discussion on suspected moisture absorption by sodium acetate, and the resulting poor quality of all positive control results, is not very convincing to me. Were results more variable than for the complex substrates? Was the acetate stored in open containers? Would moisture also be a problem for the complex substrates? Anyway, if you did not use the sodium acetate results at all, perhaps just mention the substrate (and the problems) earlier on and note that the data were not used and will not be presented. Then avoid repeating it.

Authors agree with this comment. Results were not more variable than for the complex substrates. Monohydrate sodium acetate is more hygroscopic than the dried and shredded substrate SA’. The acetate was purchased from one commercial lot, aliquoted into small boxes without desiccant, sent to all participants, and finally stored at room temperature. The hydration of sodium acetate was just a hypothesis. As suggested, we removed it from the text.

R2: You have removed some of the information but not all, and the explanation is missing. Now the paper seems even more confusing with respect to this point. For example, there is still some text on it around line 729, and it is included in Fig. 2. Please try to address this point. See earlier recommendation.

L 478-482. Microcrystalline cellulose is a popular positive control and has been recommended. It would make sense to discuss it here.

Authors agree with this comment. This point is now discussed in the revised version.

R2: OK.

L 484-494. Show these results. In the least, mean differences (at least one is given for the AMPTS result) and p values would be helpful. Plots and tables might be useful as well.

Authors agree with this comment. The results of ANOVA with 4 factors carried out for the 3 substrates are now included in this part, showing p-values < 0.005 for the variable Method (automatic vs manual) and a discussion about the significance of such results obtained from a low statistical size is also added.

R2: The figures are particularly helpful.

Discussion. Much of this repeats information given elsewhere. In places it seems more like an introduction, and an extended abstract elsewhere. Consider removing the repeated parts and much of the review. Here is what Water gives in the guide for authors:

"Discussion: Authors should discuss the results and how they can be interpreted in perspective of previous studies and of the working hypotheses. The findings and their implications should be discussed in the broadest context possible and limitations of the work highlighted. Future research directions may also be mentioned. This section may be combined with Results."

Authors agree with this valuable comment. Discussion is now merged with Results as recommended by the reviewer.

R2: OK.

In your discussion text, connections to the introduction and objectives could be more clear. For example, you stress earlier in the paper that this study uniquely measured intra-laboratory reproducibility but what is, ultimately, the significance of the estimates? A discussion of limitations and what is needed in future work would be useful also. I recommend discussing issues related to positive controls and lack of replication of the blanks (if I have understood your description correctly). Additionally, the limitations of protocol standardization/harmonization alone and a possible need for effective validation criteria seems relevant here. Please revise. Also consider whether a combined results and discussion section would be more effective.

Authors agree with this comment. Results and Discussion are now merged in the revised version, hoping that the understanding is improved for the reader.

R2: OK.

L 552-559. This seems more like a typical discussion. But I don't think it is true that the microbial consortium (i.e., the inoculum) is the only part of the tests that differ among labs. There could be bias in the various measurement methods used, or the particular way each laboratory applies them. Some assessment of overall measurement bias for laboratories (e.g., the bivariate plots mentioned above would more clearly show whether labs tend to be high/low for all substrates or whether their response varies among substrates) could be helpful here. Some more discussion on studies that assessed inoculum effects on BMP could be useful here. Consider these papers: https://doi.org/10.1016/j.biortech.2017.06.142, https://doi.org/10.1111/1751-7915.12268, https://doi.org/10.3390/app10072589, https://doi.org/10.1016/j.biortech.2013.07.051, https://doi.org/10.1016/j.biortech.2012.01.025, https://doi.org/10.3390/w12061752 .

Authors agree with this comment. The text was modified in the revised version and some references proposed by the reviewer and some others were added.

R2: OK.

L 560-563. Do you think this drying conclusion is a general one, that would apply to other substrates? Has it been assessed in other studies, with other substrates?

We do not know whether this would apply to other substrates. It probably would, except for substrates that contain volatile compounds such as VFA.

R2: OK.

R2: OTHER (NEW in R2) COMMENTS BELOW
R2: L 38 "More" -> "Moreover"?

R2: L 39 "more or . . . BMP" -> "about 15% of all observations" (or observations -> records)

R2: L 42 "is low" -> "was low"

R2: L 54-55. "alone or" -> "alone or through"

R2: L 55-56 "mainly ... methane" -> "mainly composed of methane and carbon dioxide"

R2: L 274. Delete "each".

R2: L 199-201. Delete (redundant)

R2: L 206. You state that the majority of labs used a closed bioreactor (batch test). That implies some did not--is that true? Can you clarify?

R2: L 221. "pretreatment" implies e.g., hydrothermal treatment or grinding. Consider changing wording.

R2: L 223-225. Consider moving this info on gas volume standardization to a calculations section.

R2: Fig. 1. Can you use the same terms/phrase for steps 3 and 4?

R2: Fig. 2. Consider removing positive control portion.

R2: L 289-290. Name of files is irrelevant. Remove.

R2: L 302. You wrote "substrate" but do you mean "inoculum"?

R2: L 317. Delete second "also".

R2: L 323. Can you provide location/city for Ondalys?

R2: L 339. "for some factors weak" -> "small".

R2: L 372. "kept for" -> "removed from"?

R2: L 377-380. Not clear if organizers or individual participating labs removed these observations.

R2: L 385. "deposits" is not clear.

R2: L 396. "industrial" not clear.

R2: L 403. "was" -> "is"

R2: L 438-441. Calculation of theoretical BMP belongs in methods.

R2: Table 3. Consider moving theoretical BMP to Table 1. Seems out of place here in Table 3.

R2: Fig. 3. Consider giving counts (number of observations or labs) instead of percentages in the printed numbers (e.g., 31% -> 4).

R2: Fig. 3. On line 216 you state that some lab(s) used gravimetric methods but that is not shown here.

R2: Fig. 4 and elsewhere. "Labnumber" -> "Lab number" or "Lab ID" or "Lab key" or "Lab code". . . (space is important part)

R2: Fig. 4. Figure caption does not really explain what is shown. Can you add "Summary of BMP values measured . . . showing" and then you can include your "intra-laboratory. . ."?

R2: Fig. 4. Are data missing? Fig. 5 shows results for lab 11 SA but these are not shown in Fig. 4. Please double-check.

R2: L 534. 6% -> 7%?

R2: Fig. 6. Caption needs more information. Are these mean values plus standard deviation (bars) by lab from phase 1? Why are only 9 labs shown? Could you indicate which point shows the single value that influenced the ANOVA results?

R2: L 560. State apparent nutrient solution effect (e.g., mean difference as % with p value).

R2: L 596. This implies that pH always has to be measured in all bottles. Is that correct?

R2: L 602. "the third" -> "one-third"

R2: L 685. Wording is not clear. Seems to suggest that Hafner et al. made some statement about your Table 3, which does not make sense.

R2: L 712-718, L 809-828, possibly elsewhere. This is a major problem that must be corrected. Some of the language seems different in these sections than it does in the rest of your paper. A Google search of some of the text surrounded by quotes, e.g. "In contrast, a laboratory effect was clear from the ANOVA, and it was much larger than the mean inoculum" shows that at least some of the text is identical to text in ref. 46, while other sentences are nearly identical. Please check the complete manuscript for other cases where this may be a problem. These sections need to be rewritten (not simply changed by modifying one or two words in each sentence).

R2: L 747. Do these two papers really give identical validation criteria? I think they are in fact different. Please double-check.

R2: L 786. Also consider that the measurement methods etc. were not randomly assigned.

R2: Fig. 9. It would be helpful to have the sample size (n = number of labs) included as text in these plots.

R2: L 790 and maybe elsewhere. Instead of "influence of x on y" you might use "correlation between x and y". This is more appropriate because treatments were not randomly assigned.

R2: L 835. Consider deleting "common".

R2: Appendix A. I expect that this material should go in an online supplement, but I expect the editors or publication office will tell you to make this change.

Author Response

Reviewers' comments:

In the following text, we have highlighted the new comments of Reviewer #1 in red and our answers in blue to make the document easier to read, because the review report form was based on the previous one. We have kept all the text and remarks for traceability, so that the discussion of particular points or details can be followed.

We hope that our answers and the modifications provided in version R2 will satisfy the reviewer. We worked hard, double-checking the whole document and trying to provide the most suitable answers, in order to improve the quality of our paper. We are very grateful to the reviewers, and to you as editor, for giving us the opportunity to greatly improve our manuscript.

Review report Form #1


Open Review

(x) I would not like to sign my review report
( ) I would like to sign my review report

English language and style

( ) Extensive editing of English language and style required
(x) Moderate English changes required
( ) English language and style are fine/minor spell check required
( ) I don't feel qualified to judge about the English language and style

 

 

 

Does the introduction provide sufficient background and include all relevant references? Yes

Is the research design appropriate? Yes

Are the methods adequately described? Yes

Are the results clearly presented? Can be improved

Are the conclusions supported by the results? Yes

Comments and Suggestions for Authors

R2: All comments from me (reviewer 1) for the second round are preceded by "R2:", so you can find all my responses by searching for R2 in your text editor. Most of my comments are in response to your explanations/answers, but there are also some at the bottom that relate to new topics. You have substantially improved the manuscript! Thank you for considering all my suggestions.

 

Thank you for the thorough review of our manuscript. We would like to thank all the reviewers for their detailed comments, which contributed significantly to improving the manuscript. All the reviewers' comments were taken into account and the manuscript was corrected and adjusted accordingly.

This manuscript describes an interesting interlaboratory study that adds to the limited available information on precision of BMP measurement. But the manuscript has several significant problems that need to be addressed.

 

1. Organization. What appear to be methods are described in the results section. Some sentences are repeated exactly.
2. Presentation of results. A single type of plot is used repeatedly, along with some simple tables. Some responses are discussed but the underlying data are not shown.
3. Discussion. It largely reads like an introduction, and does not include implications or limitations of the work.

 

These points are addressed in more detail below, where you can find other comments as well. Note that I have used "a -> b" to mean "change a to b" in some places.

 

Dear reviewer,

We would like to thank you for your comments and questions. We have considered all of them in order to present well-organized results and discussion sections. Some responses to your questions have been added to the manuscript.

 

Title. Biological -> Biochemical (as in your text). Otherwise it is OK as is, although it is not very descriptive and "Results from a" does not add much. You might consider adding the terms "measurement" and "precision" somehow.

 

Authors agree with this comment. The authors provide this new title “Measurement of Biochemical Methane Potential of Heterogeneous Solid Substrates: Results of a Two Phases French Inter-laboratory Study” considering the valuable proposition of the reviewer

 

R2: Good except you need to change "two phases" to "two-phase" (as adjective) or somehow use ". . .study with two phases".

Thanks for this comment. The title is now modified according to this comment as follows: “Measurement of Biochemical Methane Potential of Heterogeneous Solid Substrates: Results of a Two-Phase French Inter-laboratory Study”.

 

Abstract. Missing a conclusion or some connection between the background/motivation and your results. For example, it does not include the important result that a harmonized protocol did not improve precision.

 

Thanks for this comment. This abstract was modified.

 

R2: Good, but double-check language, e.g. "how various substrates to produce methane".

Thanks for this comment. The sentence is now modified as follows in the revised version: “Biochemical methane potential (BMP) is essential to determine the production of methane for various substrates ;…”

 

Throughout. Minor wording problems. Please closely check language. See examples:

 

Thanks for these comments. The changes were done in the revised version of the manuscript.
L 54. was -> is. Done.
L 68. "we" -- is this meant to be a personal observation? Done.
L 108. problematic -> problem (or question or topic). Done.
L 135. "twenty years (twenty past years)". Done.
L 382. few -> little. Done.
L 420. excess -> exceed. Done.
L 475. Delete one "monohydrate". Done.

 

R2: OK. But at least some of these were in fact not changed. See line 68 again.

We apologize for this inconvenience and this oversight. The sentence is now modified as following: “Consequently, an increase in the number of projects for solid waste treatment by anaerobic digestion was observed.”

 

L 106, 149 and perhaps elsewhere. Why is "inoculum" in italics?

 

Thanks for this comment. The manuscript was carefully checked and “inoculum” is now correctly written in the whole revised version.

 

R2: OK.

Thank you.

 

L 108-128. Is Mottet et al. (ref. 37) the correct citation? This paper seems to be on a different topic and I do not see the results you describe. Also, assuming the citation can be corrected, I wonder if all this detail is necessary, especially if it is given in the paper you will (ultimately) cite. Could it be written more concisely? Additionally, a recent paper covers this issue in some detail, and could be used here: Raposo et al. https://doi.org/10.1016/j.rser.2020.109890. Actually you cite it as ref. 13 but not in this part of the introduction. Please take a closer look and see if it is relevant for this part. And consider that the focus of this section is primarily on consistency or completeness of reporting of BMP test methods--perhaps information on which test components influence BMP would be more relevant.

 

The authors agree with this comment and apologize for this mistake. Reference [37] (Mottet et al., 2010) is indeed not the correct citation for this part. It corresponds to the PhD thesis of Alexis Mottet, written in French: “Alexis Mottet. Recherche d’indicateurs de biodégradabilité anaérobie et modélisation de la digestion anaérobie thermophile: Application aux boues secondaires d’épuration non traitées et prétraitées thermiquement. Sciences du Vivant [q-bio]. Université Montpellier 2 (Sciences et Techniques), 2009. Français.” This part has been partially deleted and rearranged, and now cites Ref. 14 (Raposo et al., 2020).

 

R2: Good.

Thanks for this comment.

 

L 122. "The quality of the measurement was evaluated with the use of blank. . . controls" Is this correct? It is my understanding that blanks are used simply to determine endogenous CH4 production. Also the "and/or" suggests that blanks and positive controls could be interchangeable, which is not the case.

 

The authors agree with this comment. Blanks with inoculum are indeed used only to determine the endogenous methane production, and positive controls are used only to determine inoculum activity and to validate the test. Following the previous modification, this part has been removed from the revised version.

 

R2: OK.

Thank you.

 

Introduction. In your review of interlaboratory studies you are missing another recent study here: Hafner et al. doi:10.3390/w12061752. Actually it is cited in the discussion as reference 54. Please take a closer look at the paper and see if there is information relevant to your introduction.

 

The authors agree with this valuable comment. This study is now also considered in the introduction section.

 

R2: OK.

Thank you.

 

L 196-204. These objectives are not completely consistent with the abstract, which suggests the objective was to test a harmonized protocol.

 

The authors agree with this comment. The abstract has been completely revised as suggested by the reviewer.

 

R2: The abstract is indeed more clear and complete, but the sole objective listed in the abstract is not the same as those listed on lines 188+. Perhaps you could just add "harmonized BMP protocol was *developed* and tested" and that would align them well enough.

Thanks for this comment. The sentence in the abstract has been modified accordingly, as follows: “In this paper, a harmonized BMP protocol was developed and tested with two phases of BMP tests carried out by eleven French laboratories.”

 

L 250-252. Not clear enough. Were TS and VS measured by all labs? If so is this reproducibility you describe inter-laboratory? If so, shouldn't it be in the results section?

 

The authors agree with this valuable comment. This point is now included in the Results section and is hopefully clearer. TS and VS were indeed measured by all labs at the start of the experiments in both phases. The TS and VS values given in this part were from the literature and were provided for information by the lab that prepared the substrates sent to all participants. The TS and VS measurements carried out by all participants are now given in the Results section in a dedicated table, which includes the theoretical BMP values as theoretical targets. The reproducibility is given in this table for each substrate and for both TS and VS measurements.

 

R2: OK.

Thank you.


Table 1. Did you determine elemental composition and if so could you provide calculated maximum theoretical BMP? This would be interesting to have when evaluating the measured BMP values. Even without elemental composition, you could make an estimate based on nutritional information (perhaps including some reasonable guesses) as described in your ref. 53 https://doi.org/10.3390/w12051223, although the uncertainty may be too high.

 

Thanks for this comment. No elemental composition was determined, but theoretical BMP values were calculated as described in Appendix A. They are given in Tables A1 to A4 and plotted in Figures 4a, 4b, 4c and 7a, 7b, 7c, which show the intra-laboratory reproducibility RSD.

 

R2: Good--nice addition.

Thanks for this comment.
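
For readers following this exchange, the kind of calculation the reviewer refers to (a maximum theoretical BMP from elemental composition) can be sketched with the classical Buswell stoichiometry for a compound C_c H_h O_o. This is a generic illustration only: the record states that the authors did not determine elemental composition and used the approach of their Appendix A instead. The function name and the use of 22.414 L/mol (ideal gas at 0 °C and 101.325 kPa) are assumptions made for this sketch.

```python
# Standard atomic masses (g/mol) and ideal-gas molar volume at 0 degC, 101.325 kPa.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}
MOLAR_VOLUME_L = 22.414


def theoretical_bmp(c: float, h: float, o: float) -> float:
    """Maximum CH4 yield (mL CH4 per g substrate) for C_c H_h O_o.

    Buswell stoichiometry: one mole of substrate yields
    c/2 + h/8 - o/4 moles of CH4 when fully converted to CH4 and CO2.
    Nitrogen, sulfur and microbial growth are ignored in this sketch.
    """
    mol_ch4 = c / 2 + h / 8 - o / 4
    molar_mass = (
        c * ATOMIC_MASS["C"] + h * ATOMIC_MASS["H"] + o * ATOMIC_MASS["O"]
    )
    return 1000 * MOLAR_VOLUME_L * mol_ch4 / molar_mass


if __name__ == "__main__":
    # Cellulose repeating unit, C6H10O5: roughly 415 mL CH4 per g.
    print(round(theoretical_bmp(6, 10, 5), 1))
```

Because the formula assumes complete conversion and no biomass growth, it gives an upper bound; measured BMP values are expected to fall below it.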

 

L 260-261. Correct described order, first dried then ground. Note grinded -> ground. Also, do you have reference that shows no loss of volatiles at 80C?

 

The authors agree with this comment and apologize for this mistake. The order of the preparation steps was changed. The loss of dry matter and volatile/organic compounds was described in some older studies on improving methods for determining dry matter in silage for feed. For example, Minson and Lancaster (1963) showed that oven drying at 100°C led to dry matter losses of up to 16 per cent, depending on the quantity of organic acids present. At lower oven drying temperatures (40°C, 70°C), smaller losses occurred.

Similar studies reached the same conclusion:

Fenner, H., Barnes, H.D. Improved method for determining dry matter in silage. J. Dairy Sci. 1965, 8(10):1324–1328.

McDonald, P., Dewar, W.A., 1960. Determination of dry matter and volatiles in silages. J. Sci. FdAgric. 1960. 11, 566-57.

Minson, D.J., Lancaster, R.J. The effect of oven temperature on the error in estimating the dry matter content of silage. New Zealand Journal of Agricultural Research, 1963. 6:1-2, 140-146.


R2: But do any of these show there is no loss of volatiles at 80C? If so please cite it in the paper. If not please correct wording. You could correct the wording on line 413 with "avoiding" -> "to reduce".

We agree with this comment. The change was made in the revised version accordingly, as follows: “For the preparation of the substrate SA’ (homogeneous powder), 2 lots were dried at 80°C to reduce the potential loss of volatile fatty acids (VFA) and ground to 1 cm (Blick BB 230).”

 

L 266 and elsewhere. Consider "bucket" -> "bottle", depending on what was actually used.

 

The authors disagree with this proposal. In our view, “bucket” is the right term: the mayonnaise was purchased from a food cash-discounter in buckets, and the straw was packed in closed buckets.

 

R2: OK.

Thank you.

 

L 270-271. Can you provide a reference for the mineral solution? What was it based on?

 

The mineral solution was prepared by one lab and sent to all participants. It was prepared according to the recommendations of Angelidaki et al., 2004 and Angelidaki et al., 2009, themselves adapted from Madigan et al., 2000. This last reference, “Madigan MT, Martinko JM & Parker J (2000) Brock Biology of Microorganisms, 9th edn. Prentice Hall, NY”, is now cited in the text and added to the list of references.

 

R2: OK. If it has the same composition as Angelidaki you do not need the table.

The composition is not the same but adapted from Angelidaki.

 

L 276-278. Was this done only by one laboratory? Clarify.

 

The information about TS and VS was removed from this section, and only the composition is now given here, in Table 1. The TS and VS determination was carried out by all participants. These analyses were carried out in triplicate in each lab for all substrates, which made it possible to provide reproducibility values for all substrates; these data are now given in Table 3 in the Results section.

 

R2: OK.

Thank you.

 

Section 2.4. Did the individual labs calculate BMP for each individual replicate and then submit those data? Did you provide any guidelines on how the calculation should be carried out or was it up the labs (and assumed to be simple enough that information was not needed or does not need to be described)?

 

The authors agree with this comment. All details are now provided in the revised version. Data were collected for both steps using Excel files sent to each participant.

 

R2: The details are helpful thank you. Can you provide a citation for the calculations? Also consider including an example file (with the formulas) as supplementary material.

No specific citation was inserted in the text but, to be comprehensive, we preferred to add all the calculation details in a new section called “2.2.4. Experimental data calculation”. For the participating labs, all the formulas were included in the Excel files, so the individual labs did not have to carry out any calculations.

 

L 299-305. This discussion on limitations of kinetic information from BMP tests seems out of place. Move to introduction if you keep it. Consider whether it adds to the manuscript. Also, this topic is discussed in other work including a recent paper by Koch et al. https://doi.org/10.3389/fenrg.2020.00063.

 

The authors agree with this valuable comment. This discussion has been removed from the revised version of the manuscript.

 

R2: OK.

Thank you.

 

Figure 2. I don't think this is necessary. It really only shows that the second BMP test was started 4 weeks after the first. It is not clear what you mean by BMP1, BMP2, etc. Individual bottles/replicates? How about blanks and positive control samples?

 

The authors disagree with this comment and wish to keep the figure. Figure 2 was, however, modified according to the reviewer's comments concerning replicates, blanks (inoculum) and positive control samples.

 

R2: OK. Perhaps you could remind readers that this process was carried out twice (phases 1 and 2) in the caption.

Thanks for this comment. The caption was modified as follows: “Schematic and organizational schedule of the inter-laboratory study; this process was carried out twice (phases 1 and 2).”

 

Figure 3. Consider moving to results. Could you be more specific about manual methods, e.g., were they all manometric?

 

The authors agree with this comment. Figure 3 was moved to the Results section. Manual methods were not all manometric; some were volumetric (measurement of volume displacement). The numbers of automatic vs manual and manometric vs volumetric measurements are now given in Table 8, which provides the details and sample sizes of the four-factor ANOVA for SA’, SB and SC.

 

R2: OK.

Thank you.

 

L 332-334. Was this done by each laboratory? Or did they report as-measured volume and you corrected values? If the latter, do you have reference for this calculation? There seems to be some variation in how it is carried out.

 

The authors agree with this comment. All details are now provided in the revised version. Data were collected for both steps using Excel files sent to each participant. All values were corrected directly (for temperature and pressure) in the Excel files.

 

R2: This could be made more clear in the text.

The new section “2.2.4. Experimental data calculation” is now provided in the revised version and gives all the details of the calculations.
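
As an illustration of the temperature and pressure correction discussed in this exchange, the sketch below normalizes a measured gas volume to standard conditions with the ideal gas law. It is not the authors' Excel formula, which is not reproduced in this record. The choice of 0 °C and 101.325 kPa as standard conditions, the function name, and the omission of any water vapour correction are assumptions made for this example.

```python
T_STD_K = 273.15      # assumed standard temperature (0 degC)
P_STD_KPA = 101.325   # assumed standard pressure


def normalize_gas_volume(
    volume_ml: float, temp_c: float, pressure_kpa: float
) -> float:
    """Convert a measured gas volume to the assumed standard conditions.

    Ideal gas law at constant amount of gas: V_std = V * (P / P_std) * (T_std / T).
    No water vapour correction is applied in this sketch.
    """
    temp_k = temp_c + 273.15
    return volume_ml * (pressure_kpa / P_STD_KPA) * (T_STD_K / temp_k)


if __name__ == "__main__":
    # 100 mL read at 35 degC and standard pressure shrinks to about 88.6 mL.
    print(round(normalize_gas_volume(100.0, 35.0, 101.325), 1))
```

Laboratories differ in whether they also subtract water vapour pressure, which is one reason the reviewer's request for an explicit description of the calculation matters.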

 

S 2.5. Consider briefly describing the calculations because they are not so clear in these ISO documents, which, furthermore, are not accessible to everyone. I believe these calculations are relatively simple, e.g., repeatability standard deviation was calculated as the substrate mean of standard deviation values calculated from 3 replicates for each lab x test combination, etc. But some more clarity could be good for readers without experience in this area or familiarity with the ISO standards. Also, some details are not clear. Is it correct that error from subtraction of endogenous production was not included in your precision estimates?

 

The authors agree with this comment. The calculations are now detailed in the Materials and Methods section.

 

R2: The new explanation and equations are helpful but still not quite clear, but could be addressed with some more details or perhaps a better citation. You have not really defined SSD_L. Presumably SSD_r was calculated from the difference between results from individual replicates and the mean for each particular lab. Is this correct? I don't mean to be too disagreeable on this topic but the ISO references you cite just do not describe these methods in detail. Is it possible you meant to cite ISO 5725-2 instead of ISO 5725-1? If so, perhaps these details are not needed (although my comment above still applies). Or have I overlooked the detailed descriptions in 5725-1 somehow? Or are there multiple versions of ISO 5725-1?

 

Thanks for this valuable comment. The formulas have been modified and better defined in the revised version. We hope that this clarifies this part.

Concerning the reference to ISO 5725, we agree with your comment. Looking for the details requested, we understand that the information is spread across several parts (ISO 5725-1 to 5725-6) and that we wrongly cited only ISO 5725-1. We will simply refer to ISO 5725 as a whole. We have modified this citation in the text of the revised version and in the list of references.
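
To make the quantities debated here concrete, the following sketch computes repeatability, between-laboratory and reproducibility standard deviations for one substrate in the usual one-way layout (laboratories as groups, replicates within each laboratory). It follows the general ISO 5725-2 approach for a balanced design as commonly described; it is not taken from the manuscript, and the function name, the dictionary input and the example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def precision_estimates(results_by_lab: dict) -> dict:
    """Precision estimates for a balanced design.

    results_by_lab maps a lab identifier to its replicate BMP values;
    every lab must report the same number of replicates (at least 2).
    """
    sizes = {len(values) for values in results_by_lab.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch only handles balanced designs")
    n = sizes.pop()

    lab_means = [mean(values) for values in results_by_lab.values()]
    grand_mean = mean(lab_means)

    # Repeatability variance: pooled within-lab variance.
    var_r = mean([variance(values) for values in results_by_lab.values()])
    # Between-lab variance, from the mean square of lab means.
    var_between = max((n * variance(lab_means) - var_r) / n, 0.0)
    # Reproducibility variance combines both components.
    var_repro = var_r + var_between

    return {
        "s_r": sqrt(var_r),
        "s_L": sqrt(var_between),
        "s_R": sqrt(var_repro),
        "RSD_r_percent": 100 * sqrt(var_r) / grand_mean,
        "RSD_R_percent": 100 * sqrt(var_repro) / grand_mean,
    }


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    example = {"lab1": [1.0, 2.0, 3.0], "lab2": [4.0, 5.0, 6.0], "lab3": [7.0, 8.0, 9.0]}
    print(precision_estimates(example))
```

The between-laboratory variance is truncated at zero because the moment estimator can turn negative when laboratories agree closely. Unbalanced data, such as a data set with outliers removed, need the weighted formulas instead.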

 

L 347. Can you be more specific about what you mean by "satisfactory".

 

Thanks for this comment. Repeatability and intra-laboratory reproducibility are now given with graphs showing the comparison between sets A and B for both phases. This point is now discussed in the revised version.

 

R2: Good.

Thanks for this comment.

 

L 348-353. Treatment of outliers is an important topic, as you imply. In the least, more details on how outliers were identified is needed, along with information on the number of outliers removed in the results section (I see around line 440). This information belongs in the methods section in my opinion. You might also consider repeating calculations with the suspected outliers included. It may not be correct that these outliers would never be provided to a customer.

 

The authors agree with this comment. The calculations are now detailed in the Materials and Methods section.

 

R2: OK.

Thank you.

 

Fig. 4 and similar plots. These have some problems:
* x axis label missing (presumably 1-11 are the different labs)


Thanks for this comment, and sorry for this omission. The x-axis label is now added in the corresponding figures in the revised version.

 

* It is not clear what results are shown here--BMP calculated by individual bottle (assay) (i.e., n = 6) and then summarized?

* What do different parts of plot show? Presumably mean, median, quartiles, extreme values, but please describe, perhaps in the methods section.

 

Authors agree with this comment. A description of boxplots and their different parts is now added in the revised version before the Figure 5.

 

R2: OK.

Thank you.

 

These plots show well variability within labs (e.g., by the size of each box). But if I understand correctly, they show what you referred to as repeatability and intra-laboratory reproducibility combined, which obscures information. Could you, instead, use different symbols for sets A and B, i.e., two boxes for each lab shown in different colors. They also provide some information on correlation among substrates within a lab, i.e., lab 4 shows low results for both SA, SA', but not SB. Finally, they show variability among labs (different boxes etc.). So they are useful figures but some results could be more clearly shown with different plots. These would include some statements in the results on the apparent effects (or lack of effects) of measurement method or other factors. Adding more information to the existing plots is an alternative for some results. For example, using different colors for the different measurement methods would allow readers to compare them. For others, new plots or tables would be helpful. For example, comparison of SA and SA' is probably best done using a scatterplot (SA BMP on one axis, SA' on another, points for each lab). Please consider how you can more clearly show the important results.

 

The authors agree with this comment. New figures (Fig. 4a,b,c and Fig. 7a,b,c) were added that provide the results for the intra-laboratory reproducibility, along with a scatterplot showing the correlation between SA and SA’.

 

R2: I like the new figures. But do you need both the bar charts and the box plots (e.g., both Figs. 4 and 5)? They seem to show the same information in slightly different ways.

The authors disagree with this comment. The bar charts describe the repeatability and intra-laboratory results (averages and standard errors), whereas the box plots give the inter-laboratory comparisons (mean, max, min, variability). In our view, this information is complementary.

 

L 354-357 and 441-445. Details on these precautions for the statistical analysis are needed.

 

Thanks for this comment. Details are now included in the Materials and Methods section.

 

R2: OK.

Thank you.

 

L 356. workforce -> sample size?

 

Thanks for this comment. This word has been changed in revised version.

 

R2: OK.

Thank you.

 

L 360-362. "For other factors like the assessment method. . . " No ANOVA or other results are shown to support this statement. If you think this is a significant result, please support it by showing at least mean values and p values from hypothesis tests or similar information (i.e., standard error estimate for comparison), possibly along with a figure or table. As part of this information, please include the sample sizes for the different factor levels (so far this is only given for measurement type in Fig. 2). If you do not believe these relationships and others (see related comments below) are meaningful (after all, the sample size is quite small (n = 3 for AMPTS), there is a repeated/hierarchical/nested structure as you pointed out in L 354-357, and, in addition, factor levels were not randomly assigned) then why mention them at all? In that case simply state that there was no attempt to assess the impact of test factors on BMP and briefly explain why.

 

The authors agree with this comment. All sample sizes are now provided in the revised version, in dedicated tables listing the expected, removed and considered values after statistical curation. The reasons for caution in the statistical quantification of significance were also added to the Results section in the revised version.

 

R2: OK

Thank you.

 

“Results for the statistical quantitation of significativity for the different factors obtained by ANOVA should be interpreted with caution. The measurements were taken from actual laboratory practices, therefore formed an incomplete and unbalanced experiment design, with no randomly assignment for the factor levels. The sample size for some factor levels, especially for the experimental system (Manual/AMPTS), was for some factors weak. Thereby, a certain number of precautions were applied during the statistical analysis in order to particularly dismiss the terms with too small a workforce, and to consider the nested nature of certain factors (method / gas measurement / agitation).”

 

please include the sample sizes for the different factor levels

 

New tables (Table 4 and Table 6) have been introduced with the details of the sample sizes, as requested.

 

R2: OK.

Thank you.

 

L 362-363. If you conclude that there was no detectable effect for the effect of freezing etc. (SA vs. SA'), then support this with results from your statistical analysis. Include the mean difference and a p value at least. And did you use a paired test here, e.g., include "lab" as a factor? The design is clearly paired/"repeated measures" so you should take advantage of the higher power this provides.

The authors agree with this valuable comment. This part was modified in the revised version. Results of a two-factor ANOVA with interaction are now given and show no difference between SA and SA’, with a p-value of 0.563 for the factor Sample.

 

No paired tests were carried out in our study as we understand the question.

 

R2: This is in lines 542+, correct? The description could be more clear. Instead of "sample", do you mean "substrate" or "treatment (dried vs. not dried)" or just "drying"? Also check the language in the paragraph--the bit on excluding the results from a single lab is particularly unclear. And it is surprising that a single pair of observations would have this effect (an increase in the p value by including the point makes more sense, but a decrease--strange). If you have included "lab" as a factor in your ANOVA, this is essentially a paired approach (equivalent to a paired t-test). But it does not make sense to have 50 values for SA and 53 for SA'--you should drop the unpaired values, right?

 

We apologize for this misunderstanding. We agree that, because our ANOVA includes the factor Lab, it is a paired approach. The text was modified, replacing the factor name Sample with Substrate. The exclusion of a single lab was already mentioned in the revised version R1. We double-checked the results and confirm that a single lab substantially changed the result.

Concerning the comment about the unpaired values for SA vs SA’: the design is indeed slightly unbalanced, but we do not think this affects the result. Since the design is almost balanced, we did not remove the unpaired values.
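
The pairing question raised by the reviewer can be illustrated with a small sketch: keep only the laboratories that reported both SA and SA’, take the per-laboratory difference, and compute a paired t statistic. This is a simplified stand-in for the ANOVA with Lab as a factor described in the response, not the authors' analysis. The function name and all values in the example are hypothetical, and no p-value is computed because that would require the t distribution from an external library.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(sa_by_lab: dict, sa_prime_by_lab: dict) -> tuple:
    """Paired comparison of lab-level mean BMP for SA and SA'.

    Labs present in only one of the two dictionaries are dropped, which is
    the step the reviewer asks about. Returns (number of pairs, mean
    difference SA - SA', t statistic with n - 1 degrees of freedom).
    """
    common_labs = sorted(set(sa_by_lab) & set(sa_prime_by_lab))
    if len(common_labs) < 2:
        raise ValueError("at least two paired labs are needed")
    diffs = [sa_by_lab[lab] - sa_prime_by_lab[lab] for lab in common_labs]
    mean_diff = mean(diffs)
    std_error = stdev(diffs) / sqrt(len(diffs))
    return len(diffs), mean_diff, mean_diff / std_error


if __name__ == "__main__":
    # Hypothetical lab means; lab 4 and lab 5 are unpaired and get dropped.
    sa = {1: 300.0, 2: 310.0, 3: 290.0, 4: 305.0}
    sa_prime = {1: 295.0, 2: 312.0, 3: 288.0, 5: 300.0}
    print(paired_t_statistic(sa, sa_prime))
```

Working on within-laboratory differences removes the between-laboratory variation from the comparison, which is the source of the extra power the reviewer mentions.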

 

L 375. samples -> substrates?

 

The change has been carried out.

 

R2: Please check elsewhere also.

The authors apologize for this mistake. We double-checked the whole manuscript and made the change throughout.

 

L 381-388. Show more details to support these results. See comment above for L 360-362. Additionally, some information is repeated here (SA vs. SA').

 

Please refer to our answer to the previous comment about L 360-362. Repetitions have been removed.

 

R2: OK.

Thank you.

 

Section 3.2. This material seems to be part of the methods, and I think it should be moved to that section.

 

One part could be moved to Methods, but not the other, because these considerations were decided from the discussion of the results obtained in the first phase.

 

R2: OK.

Thank you.

 

L 401-402. Why capitalize "Mineral" etc.?

 

Thanks for this comment. The modification has been carried out.

 

R2: OK.

Thank you.

 

L 403. Why capitalize "Carbonate"?

 

Thanks for this comment. The modification has been carried out.

 

R2: OK.

Thank you.

 

L 404-405. Clarify, so I:S could be 2 or 5 for some substrates, i.e., some labs used 2 and some used 5? Which substrate(s)?

 

Thanks for this comment. All labs used an I/S ratio of 2. This point has been clarified and the sentence modified in the revised version. A ratio of 5 is proposed only for specific substrates (not used in this study).

 

R2: OK.

Thank you.

 

L 409. Is "blank" an appropriate term for a positive control? To me "blank" implies no sample material.

 

The authors agree with this valuable comment and with the fact that “blank” implies no sample material. “Blank” was indeed not the appropriate term for a positive control; “positive control” is more appropriate. This has been clarified and corrected in the revised version of the manuscript.

 

R2: OK.

Thank you.

 

L 406-409. Does this mean only one bottle each was used for the negative and one for the positive control? Clarify. If so this is quite different from other recommendations and problematic. In particular this provides no means to estimate precision of the endogenous production. This issue requires some discussion.

 

Blanks and positive controls were run in triplicate. This point is now clarified in the revised version.

 

R2: OK.

Thank you.

 

L 414. Why use "CV" here when "RSD" is used elsewhere? Also applies to lines 434 and 436.

 

The authors agree with this comment and apologize for this mistake. CV has been replaced by RSD in this part of the revised version.

 

R2: OK

Thank you.

 

L 416-419. Considering that positive control results were discarded, does it make sense to keep this criterion?

 

Authors agree with this comment. The text is now modified.

 

R2: I do not see any change here. Please consider original comment.

We apologize for this misunderstanding and for the mistake in our answer. The error is ours: we did not modify this point because we want to keep this criterion.

 

L 423. excess -> exceed.

 

Thanks for this comment. The modification has been carried out.

 

R2: OK.

Thank you.

 

L 422-423. As I understand what you've written you mean the time at which the BMP itself increases by less than 1% in a day. Consider whether this is written clearly enough, especially because other, slightly different, criteria are in use.

 

Authors agree with this comment. The text was now modified according to the comment.

 

R2: I don't see any change in the text. Please check again. (And please be careful to avoid stating that a change was made when in fact it was not made--this erodes trust and makes reviewing difficult.)

We understand and apologize for this. In the previous revised version, we removed this point from the subsection on validation criteria and created a specific bullet point dedicated to the end-of-BMP criterion (the sentence on the end-of-BMP criterion was taken out of the validation criteria). You are right: we did not modify the sentence itself, but we restructured the paragraph in response to your first comment. This rearrangement was perhaps not sufficiently explained in our answer. We are sorry for the inconvenience caused by our vagueness.
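
The termination criterion as the reviewer reads it (the test ends when cumulative BMP increases by less than 1% in a day) can be written out as a short sketch. As the reviewer notes, slightly different criteria are in use, so this reflects only that one reading; the function name, the daily sampling assumption and the example curve are hypothetical.

```python
from typing import Optional, Sequence


def first_day_below_threshold(
    cumulative_bmp: Sequence[float], threshold: float = 0.01
) -> Optional[int]:
    """Return the first day on which the daily increase in cumulative
    (net) BMP is below `threshold` as a fraction of the cumulative value
    on that day, or None if the criterion is never met.

    cumulative_bmp[i] is the cumulative value at day i, with one reading
    per day assumed.
    """
    for day in range(1, len(cumulative_bmp)):
        current = cumulative_bmp[day]
        if current <= 0:
            continue  # relative increase is undefined before production starts
        daily_increase = current - cumulative_bmp[day - 1]
        if daily_increase / current < threshold:
            return day
    return None


if __name__ == "__main__":
    curve = [0.0, 100.0, 180.0, 230.0, 250.0, 252.0, 253.0]  # made-up data
    print(first_day_below_threshold(curve))  # day 5: 2 / 252 is below 1%
```

A single-day check like this one can stop a test too early if production pauses during a lag phase, so a stricter variant could require the condition to hold on several consecutive days.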

 

L 414-423. Were these criteria evaluated by each participating lab or after submission by the organizers?

 

Thanks for this comment. This point is now clearly explained in Material and Methods section.

 

R2: OK.

Thank you.

 

L 424. How many tests were discarded by participating labs?

 

Thanks for this comment. The number of expected, discarded and considered values by participating labs are now clearly mentioned in Tables 4 and 6.

 

R2: OK. Perhaps "expected" is not the best term. This implies that all these tests were not necessarily carried out or submitted.

In our view, “expected” is suitable because it is the maximum number of results that would have been obtained if no lab had removed outliers or abnormal results.

 

L 425-431. This text is a duplicate of text around line 348. Can you reorganize the text so you do not need to repeat it? As mentioned above, I think information on how outliers were identified and handled belongs in the methods.

 

Authors agree with this comment. The text is now reorganized as requested by the reviewer.

 

R2: OK.

Thank you.

 

L 439. Vague. Give more details.

 

Authors agree with this comment. The text is now clarified as requested by the reviewer.

 

R2: The text has in fact not been changed. See original comment. What do you mean by "technical justification"?

We wrongly assumed that rearranging the text was sufficient clarification: the part concerning the removal of outliers was moved from the Results section to the Materials and Methods section (subsection 2.2.4, results analysis). It seems that this was not the case. In the revised version R2, we provide more details about what “technical justification” means (gas leakage, broken bottles, electrical shutdown, failure of devices or sensors, problems with the heating system, ...).

 

L 465-466. Unclear what this means. By "will be" do you mean in future work or below?

 

Authors agree with this comment. The effect of the inoculum is now better discussed in the revised manuscript.

 

R2: OK.

Thank you.

 

L 468-477. This discussion on suspected moisture absorption by sodium acetate, and the resulting poor quality of all positive control results, is not very convincing to me. Were results more variable than for the complex substrates? Was the acetate stored in open containers? Would moisture also be a problem for the complex substrates? Anyway, if you did not use the sodium acetate results at all, perhaps just mention the substrate (and the problems) earlier on and note that the data were not used and will not be presented. Then avoid repeating it.

 

The authors agree with this comment. Results were not more variable than for the complex substrates. Sodium acetate monohydrate is more hygroscopic than the dried and shredded substrate SA’. The acetate was purchased from one commercial lot, aliquoted into small boxes without desiccant, sent to all participants, and finally stored at room temperature. The hydration of sodium acetate was only a hypothesis. As suggested, we removed it from the text.

 

R2: You have removed some of the information but not all, and the explanation is missing. Now the paper seems even more confusing with respect to this point. For example, there is still some text on it around line 729, and it is included in Fig. 2. Please try to address this point. See earlier recommendation.

The sentence “Indeed, the moisture of sodium acetate used here as positive control, which was not measured by the participants, may have varied during the transportation or storage according to its more or less important exposure to humidity.” has now been removed in the second version. We keep the positive control in Figure 2 because this figure describes the protocol followed by all the labs and does not anticipate the results of the positive controls. The explanation is still missing: we think that the moisture content of the acetate could be the cause, but we have no numerical evidence.

 

L 478-482. Microcrystalline cellulose is a popular positive control and has been recommended. It would make sense to discuss it here.

 

Authors agree with this comment. This point is now discussed in the revised version.

 

R2: OK.

Thank you.

 

L 484-494. Show these results. In the least, mean differences (at least one is given for the AMPTS result) and p values would be helpful. Plots and tables might be useful as well.

 

The authors agree with this comment. The results of the four-factor ANOVA carried out for the 3 substrates are now included in this part, showing p-values < 0.005 for the variable Method (automatic vs manual). A discussion of the significance of such results, obtained from a small sample size, has also been added.

 

R2: The figures are particularly helpful.

Thanks for this comment.

 

Discussion. Much of this repeats information given elsewhere. In places it seems more like an introduction, and an extended abstract elsewhere. Consider removing the repeated parts and much of the review. Here is what Water gives in the guide for authors:

"Discussion: Authors should discuss the results and how they can be interpreted in perspective of previous studies and of the working hypotheses. The findings and their implications should be discussed in the broadest context possible and limitations of the work highlighted. Future research directions may also be mentioned. This section may be combined with Results."

 

The authors agree with this valuable comment. The Discussion is now merged with the Results, as recommended by the reviewer.

 

R2: OK.

Thank you.

 

In your discussion text, connections to the introduction and objectives could be more clear. For example, you stress earlier in the paper that this study uniquely measured intra-laboratory reproducibility but what is, ultimately, the significance of the estimates? A discussion of limitations and what is needed in future work would be useful also. I recommend discussing issues related to positive controls and lack of replication of the blanks (if I have understood your description correctly). Additionally, the limitations of protocol standardization/harmonization alone and a possible need for effective validation criteria seems relevant here. Please revise. Also consider whether a combined results and discussion section would be more effective.

The authors agree with this comment. Results and Discussion are now merged in the revised version, which we hope makes the paper easier for the reader to follow.

 

R2: OK.

Thank you.

 

L 552-559. This seems more like a typical discussion. But I don't think it is true that the microbial consortium (i.e., the inoculum) is the only part of the tests that differ among labs. There could be bias in the various measurement methods used, or the particular way each laboratory applies them. Some assessment of overall measurement bias for laboratories (e.g., the bivariate plots mentioned above would more clearly show whether labs tend to be high/low for all substrates or whether their response varies among substrates) could be helpful here. Some more discussion on studies that assessed inoculum effects on BMP could be useful here. Consider these papers: https://doi.org/10.1016/j.biortech.2017.06.142, https://doi.org/10.1111/1751-7915.12268, https://doi.org/10.3390/app10072589, https://doi.org/10.1016/j.biortech.2013.07.051, https://doi.org/10.1016/j.biortech.2012.01.025, https://doi.org/10.3390/w12061752 .

 

The authors agree with this comment. The text was modified in the revised version, and some of the references proposed by the reviewer, along with some others, were added.

 

R2: OK.

Thank you.

 

L 560-563. Do you think this drying conclusion is a general one, that would apply to other substrates? Has it been assessed in other studies, with other substrates?

 

We do not know whether this would apply to other substrates. It probably would, except for substrates that contain volatile compounds such as VFA.

 

R2: OK.

Thank you.

 

R2: OTHER (NEW in R2) COMMENTS BELOW

Thank you for these new, valuable comments, which helped us further improve the quality of this manuscript.


R2: L 38 "More" -> "Moreover"?

Thanks for this comment. The change was made in the revised version according to the comment: “Moreover, statistical analyses of all the results, after removal of the outliers (about 15 % of all observations)”.

 

R2: L 39 "more or . . . BMP" -> "about 15% of all observations" (or observations -> records)

Thanks for this comment. The change was made in the revised version according to the comment: “Moreover, statistical analyses of all the results, after removal of the outliers (about 15 % of all observations)”.

 

R2: L 42 "is low" -> "was low"

Thanks for this comment. The change was made in the revised version according to the comment: “. On the other hand, the average intra-laboratory repeatability was low…”

 

R2: L 54-55. "alone or" -> "alone or through"

Thanks for this comment. The change was made in the revised version according to the comment: “…of organic substrates (alone or through codigestion)…”

 

R2: L 55-56 "mainly ... methane" -> "mainly composed of methane and carbon dioxide"

Thanks for this comment. The change was made in the revised version according to the comment: “…in this way biogas (mainly composed by methane and carbon dioxide) and digestate.”

 

R2: L 274. Delete "each".

Thanks for this comment. The change was made in the revised version according to the comment: “For both phases,… ”

 

R2: L 199-201. Delete (redundant)

Thanks for this comment. This sentence “The methanogenic potential, usually called BMP (Biochemical Methane Potential) corresponds to the amount of methane produced by an organic substrate during its biodegradation under anaerobic conditions.” was deleted in the new revised version.

 

R2: L 206. You state that the majority of labs used a closed bioreactor (batch test). That implies some did not--is that true? Can you clarify?

We agree with this valuable comment. “majority of” and “a closed” were removed from this sentence: “The measurement methods were based on the cultivation in bioreactors of well-known quantities of organic material and anaerobic microorganisms (inoculum),…”

 

R2: L 221. "pretreatment" implies e.g., hydrothermal treatment or grinding. Consider changing wording.

Thanks for this comment. We agree, and the change was made in the revised version accordingly: “The preparation steps of substrates were the same for automatic or conventional BMP tests.”

 

R2: L 223-225. Consider moving this info on gas volume standardization to a calculations section.

A new calculation section (2.2.4. Experimental data calculation) was introduced that includes this sentence and the formulas mentioned earlier.

 

R2: Fig. 1. Can you use the same terms/phrase for steps 3 and 4?

Thanks for this comment. The figure is now modified according to this request, as follows: “Proposal of harmonized protocol.”

 

R2: Fig. 2. Consider removing positive control portion.

We disagree with this comment. For us, this description is needed and is part of the organizational schedule. We prefer to keep the positive control portion in Figure 2 because this figure describes the protocol followed by all the labs and does not anticipate the results of the positive controls.

 

R2: L 289-290. Name of files is irrelevant. Remove.

We agree with this comment. The sentence is now deleted.

 

R2: L 302. You wrote "substrate" but do you mean "inoculum"?

This point does indeed concern the substrate, not the inoculum.

 

R2: L 317. Delete second "also".

Thanks for this comment. The change was made in the revised version according to the comment: “…and graphs were also automatically plotted for each set (substrate, blank, positive control) …”

 

R2: L 323. Can you provide location/city for Ondalys?

Thanks for this comment. The location is now provided in the revised version according to this request, as follows: “All Excel files were collected and treated by the Ondalys company (Clapiers, France), partner of the inter-laboratory study.”

 

R2: L 339. "for some factors weak" -> "small".

Thanks for this comment. The change was made in the revised version according to the comment: “The sample size for some factor levels, especially for the experimental system (Manual/AMPTS), was for some factors small.”

 

R2: L 372. "kept for" -> "removed from"?

Thanks for this comment. The change was made in the revised version according to the comment: “The basic idea behind these rules was these results would not have been provided to a customer, and which were removed from statistical analysis because there was no objective reason to rule them out.”

 

R2: L 377-380. Not clear if organizers or individual participating labs removed these observations.

All observations were removed by the labs themselves, either directly or after a discussion with Ondalys and the organizers, who suggested removing some datasets. The final decision was always taken by the lab.

 

R2: L 385. "deposits" is not clear.

Thanks for this comment. The change was made in the revised version according to the comment: “The solid organic substrates studied in anaerobic digestion projects were highly variable depending on the feedstocks considered.”

 

R2: L 396. "industrial" not clear.

Thanks for this comment. The change was made in the revised version as follows: “Commercial mayonnaise was chosen.”

 

R2: L 403. "was" -> "is"

Thanks for this comment. The change was made in the revised version according to the comment: “The composition of these substrates is given in Table 1.”

 

R2: L 438-441. Calculation of theoretical BMP belongs in methods.

We disagree with this comment. From our point of view, the calculation of theoretical BMP does not belong in the Methods and should be considered in the Results section.

 

R2: Table 3. Consider moving theoretical BMP to Table 1. Seems out of place here in Table 3.

We disagree with this comment. For the same reason as mentioned previously, we keep the theoretical BMP in Table 3 and do not want to move it to Table 1.

 

R2: Fig. 3. Consider giving counts (number of observations or labs) instead of percentages in the printed numbers (e.g., 31% -> 4).

The figure is now modified as requested and shows numbers instead of percentages. The caption is also modified as follows: “Numbers refer to the 13 lab results.”

 

 

R2: Fig. 3. On line 216 you state that some lab(s) used gravimetric methods but that is not shown here.

You are right: no gravimetric methods were used in this study. The line in question belongs to the introduction of BMP measurement, and we think it is important to mention that gravimetric methods exist. To clarify, the sentence was modified as follows: “The methanogenic potential of the substrates can be determined using manual (volumetric or manometric measurements), automatic [30, 49, 50, 53] or gravimetric methods [51-53].”

 

R2: Fig. 4 and elsewhere. "Labnumber" -> "Lab number" or "Lab ID" or "Lab key" or "Lab code". . . (space is important part)

Thanks for this comment. The change to “Lab ID” was made in Figures 4, 5, 7 and 8 in the revised version according to this comment.

 

R2: Fig. 4. Figure caption does not really explain what is shown. Can you add "Summary of BMP values measured . . . showing" and then you can include you "intra-laboratory. . ."?

Thanks for this valuable comment. The caption was modified as proposed for Figure 4 and Figure 7: “Figure 4. Summary of BMP values measured for the results of the first phase – free protocols showing intra-laboratory repeatability and intra-laboratory reproducibility: (a) SA; (b) SA’; (c) SB.” and “Figure 7. Summary of BMP values measured for the results of the second phase – harmonized protocol showing intra-laboratory repeatability and intra-laboratory reproducibility: (a) SA’; (b) SB; (c) SC.”

 

R2: Fig. 4. Are data missing? Fig. 5 shows results for lab 11 SA but these are not shown in Fig. 4. Please double-check.

No data are missing; the gaps correspond to the removed values given in Table 4. This is now explained in the text: “A significant number of outliers had been ruled out (22% for SA, 13% for SA’ and 24% for SB) that explained why some data are missing in Figure 4, for instance set A for substrate SA for lab ID 1.”

 

R2: L 534. 6% -> 7%?

Thank you for this correction. We corrected our mistake in the new revised version: “…were respectively 4 - 7% and 6 - 9%.”

 

R2: Fig. 6. Caption needs more information. Are these means values plus standard deviation (bars) by lab from phase 1? Why are only 9 labs shown? Could you indicate which point shows the single value that influenced the ANOVA results?

The caption was modified as follows: Scatterplot SA vs SA’, mean values and standard deviation for each lab.

The results of Lab ID 11 are not shown in Figure 4 and Figure 6; this is the lab that influenced the ANOVA results. The scatterplot including this lab is attached, but we do not think it is worth showing in the paper.

Figure 6. Scatterplot SA vs SA’, mean values and standard deviation for each lab

 

These mean values and standard deviations by lab do indeed concern phase 1, since substrate SA was only tested in the first phase.

Only nine labs are included in this scatterplot because:

  • lab ID 13 had no test for SA
  • labs ID 7 and 8 worked on a slightly modified SA substrate (dried but not ground), and it was decided not to include these results in the comparison graph between SA and SA’ or in the ANOVA.

 

R2: L 560. State apparent nutrient solution effect (e.g., mean difference as % with p value).

Thanks for this comment. We provided this information in the new revised version of the manuscript and modified the sentence as follows: “The sole parameter showing a low effect, higher BMP values (about 9% for the three substrates with limited p-values), was the use or not of the mineral nutrient solution, more than the use of a buffer solution. The values were respectively given for with vs without mineral nutrient solution for SA, SA’ and SB: 414 vs 445 (p-value = 0.011), 389 vs 421 (p-value = 0.008) and 257 vs 281 NmL.gVS-1 (p-value = 0.005).”
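
The relative differences implied by the quoted mean values can be recomputed directly. The following is only a minimal sketch: the names `quoted_means` and `relative_difference_percent` are hypothetical, and the choice of the lower mean of each pair as the baseline is an assumption, since the response does not state which value serves as the reference.

```python
# Mean BMP values (NmL per g VS) quoted in the response, one pair per substrate.
# Assumption: the lower mean of each pair is used as the baseline.
quoted_means = {
    "SA": (414.0, 445.0),
    "SA'": (389.0, 421.0),
    "SB": (257.0, 281.0),
}


def relative_difference_percent(pair):
    """Return the difference between the two means as a percentage of the lower one."""
    low, high = min(pair), max(pair)
    return (high - low) / low * 100.0


differences = {
    name: relative_difference_percent(pair) for name, pair in quoted_means.items()
}

for name, diff in differences.items():
    print(f"{name}: {diff:.1f}%")
```

Under this baseline choice, the three differences fall between roughly 7% and 10%; a different reference value would give slightly different percentages.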

 

R2: L 596. This implies that pH always has to be measured in all bottles. Is that correct?

Thanks for this valuable comment. This detail was not discussed when proposing the harmonized protocol, but from our point of view it was evident that the pH must be measured in all bottles or at least in one bottle per triplicate. We prefer not to modify the already published protocol and to keep our sentence without this detail in our manuscript: “The final pH value had to be higher than 6.5”.

 

R2: L 602. "the third" -> "one-third"

Thanks for this comment. The change was made in the revised version according to the comment: “The methane part due to the endogenic activity wouldn’t exceed one-third of the whole methane production of the assay.”
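
The two acceptance rules quoted from the harmonized protocol (final pH higher than 6.5, and endogenous methane not exceeding one-third of the total methane production of the assay) could be checked per bottle along the following lines. This is a hedged sketch, not the authors' procedure: the function name `assay_checks`, its parameters, and the use of the blank's methane volume as the endogenous contribution are assumptions made for illustration.

```python
def assay_checks(final_ph, blank_ch4_ml, total_ch4_ml):
    """Check one assay against the two quoted protocol rules.

    blank_ch4_ml: methane attributed to the inoculum alone (assumed to come
                  from the blank), in the same units as total_ch4_ml.
    total_ch4_ml: whole methane production of the assay (substrate + inoculum).
    """
    if total_ch4_ml <= 0:
        raise ValueError("total methane production must be positive")

    endogenous_share = blank_ch4_ml / total_ch4_ml
    return {
        # "The final pH value had to be higher than 6.5"
        "ph_ok": final_ph > 6.5,
        # Endogenous methane should not exceed one-third of the total.
        "endogenous_ok": endogenous_share <= 1.0 / 3.0,
        "endogenous_share": endogenous_share,
    }


result = assay_checks(final_ph=7.2, blank_ch4_ml=100.0, total_ch4_ml=400.0)
print(result)
```

Returning the computed share alongside the two flags makes it easier to see how close an assay is to the one-third limit.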

 

R2: L 685. Wording is not clear. Seems to suggest that Hafner et al. made some statement about your Table 3, which does not make sense.

Thanks for this valuable comment. We modified the sentence according to the comment as following: “As shown also by Hafner et al. [46], the substrate volatile solids measurement reported here (Table 3) did not impact the observed variability.”

 

R2: L 712-718, L 809-828, possibly elsewhere. This is a major problem that must be corrected. Some of the language seems different in these sections than it does in the rest of your paper. A Google search of some of the text surrounded by quotes, e.g. "In contrast, a laboratory effect was clear from the ANOVA, and it was much larger than the mean inoculum" shows that at least some of the text is identical to text in ref. 46, while other sentences are nearly identical. Please check the complete manuscript for other cases where this may be a problem. These sections need to be rewritten (not simply changed by modifying one or two words in each sentence).

Thank you for this comment. You are right, and we apologize. We borrowed some passages from other papers, always citing the reference, but we acted hastily to meet the revision deadline and did not check this point sufficiently. We have double-checked the whole manuscript after carefully rereading the cited publications.

 

R2: L 747. Do these two papers really give identical validation criteria? I think they are in fact different. Please double-check.

Thanks for this comment. We now keep only the validation criteria recommended by Hafner et al. (2020): “Finally, Hafner et al. [46] recommend firstly a relative standard deviation for cellulose BMP not higher than 6%, and secondly a mean cellulose BMP between 340 and 395 NmLCH4 gVS-1.”
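
As an illustration of how the two quoted criteria could be applied to a set of cellulose replicates, here is a minimal sketch. The function name `cellulose_control_ok` is hypothetical, and two details are assumptions not stated in the response: the relative standard deviation is computed from the sample standard deviation, and the bounds of the 340 to 395 range are treated as inclusive.

```python
from statistics import mean, stdev


def cellulose_control_ok(bmp_values, max_rsd=0.06, low=340.0, high=395.0):
    """Check cellulose BMP replicates (NmL CH4 per g VS) against the quoted criteria.

    Assumptions: sample standard deviation (n - 1) for the RSD, inclusive bounds.
    Requires at least two replicate values.
    """
    m = mean(bmp_values)
    rsd = stdev(bmp_values) / m
    return rsd <= max_rsd and low <= m <= high


print(cellulose_control_ok([350.0, 360.0, 370.0]))
```

A set of replicates fails the check if either the spread is too large or the mean falls outside the range, so both conditions are combined in a single boolean.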

R2: L 786. Also consider that the measurement methods etc. were not randomly assigned.

Thanks for this valuable comment. The sentence was modified as follows in the new revised version: “In this way, it was quite difficult to consider these conclusions as significant because of the too low statistical size, and to a not randomly assigned measurement methods and parameters.”

 

R2: Fig. 9. It would be helpful to have the sample size (n = number of labs) included as text in these plots.

The sample size was already given in Table 9. It refers to the number of tests rather than the number of labs. From our point of view, this information would be redundant if repeated in the caption.

 

R2: L 790 and maybe elsewhere. Instead of "influence of x on y" you might use "correlation between x and y". This is more appropriate because treatments were not randomly assigned.

Thanks for this comment. We kept the word “influence” in several sentences, except in the following ones:

“The effect of these factors was studied by ANOVA analysis and was detailed below.“

“Finally, the results of ANOVA with four factors for SA’, SB and SC showed there was no correlation of the inoculum endogenous activity for all the laboratories...”

 

R2: L 835. Consider deleting "common".

Thanks for this comment. The change was made in the revised version according to the comment: “This work results from inter-laboratory assays on three substrates.”

 

R2: Appendix A. I expect that this material should go in an online supplement, but I expect the editors or publication office will tell you to make this change.

For us, this material is in the right place because it is important for understanding the work. We consider that these data should be placed in an Appendix, as proposed in our first revised version, and not as Supplementary data, in line with the template: “The appendix is an optional section that can contain details and data supplemental to the main text. For example, explanations of experimental details that would disrupt the flow of the main text, but nonetheless remain crucial to understanding and reproducing the research shown; figures of replicates for experiments of which representative data is shown in the main text can be added here if brief, or as Supplementary data. Mathematical proofs of results not central to the paper can be added as an appendix.”

Author Response File: Author Response.pdf
