A Machine Learning-based Pipeline for the Classification of CTX-M in Metagenomics Samples
Round 1
Reviewer 1 Report
The manuscript submitted by Ceballos et al. represents an interesting
study where the authors describe a machine learning pipeline for the
classification of CTX-M in metagenomics samples.
The study is not adequately presented and lacks novelty. I have
numerous concerns, which I present below.
Comment:
* What are CTX-M genes, and why did you choose to develop a pipeline
for such genes? This is not clear in the main text, and the reader
should not have to google the acronym to understand the objective of
the study. Please define it and better describe your aim in the
Introduction section.
* Please revise the format of the article by improving the
description of the methods and by describing (and not merely
inserting) the figures in the main text.
* The English needs to be improved, in particular for section 2.1.
* Sections 3.1 and 3.2 cannot be presented in this way.
* Please provide the data you used, with a clear reference. Readers
must be able to repeat your experiments.
* Several words are written with a "-" in the middle of the words.
Please correct them.
Author Response
The recommendations given will be adopted, and the file with the answers to each item will be attached.
Author Response File: Author Response.pdf
Reviewer 2 Report
In this paper, the authors present a framework and computational pipeline to identify CTX-M gene groups. The pipeline seems to be useful, but many parts of the paper are not clear and the English needs to be re-edited extensively.
1. The paper mentions that the training set was selected based on the CTX-M database, but how the training set was constructed is still not clear. For example, how big is the training set? How did the authors determine the class groups in the training set? More details need to be described.
2. For the neural network part (Figure 3), it is not clear to me what the input is. What are the Xs in Figure 3? If they are k-mer sequences, how did the authors convert the sequence data into X values, which range from 0 to 1?
3. The purpose of showing Figure 4 is not clear to me. There is also no description in Section 3.1.
4. For Section 3.2, it is mentioned that the model was trained with different learning rates, training epochs, etc., but the detailed results are missing.
5. The benefits of using this pipeline are not clear to me. How does its performance compare with that of other similar tools?
6. Documentation on how to use the pipeline is missing.
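As an illustration of the question raised in comment 2, one common way to turn a nucleotide sequence into inputs between 0 and 1 is to use normalized k-mer frequencies. The sketch below is an assumption made for illustration only: the report does not say how the manuscript encodes its input, and the function name `kmer_frequencies` and the example sequence are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence: str, k: int = 3) -> list[float]:
    """Return normalized k-mer frequencies over the DNA alphabet ACGT.

    Hypothetical encoding for illustration; not the manuscript's method.
    """
    sequence = sequence.upper()
    # Fixed, ordered vocabulary so every sequence maps to the same 4**k positions.
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Only k-mers made of A, C, G, T are counted (windows with N are skipped).
    total = sum(counts[kmer] for kmer in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    # Each entry is a relative frequency, so it lies in [0, 1].
    return [counts[kmer] / total for kmer in vocabulary]


# Hypothetical example sequence; k=2 gives a 16-dimensional input vector.
features = kmer_frequencies("ATGGTTAAAAAATCACTGCGCCAG", k=2)
```

With this kind of encoding, each X would be the relative frequency of one k-mer, and the entries of one vector sum to 1.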
Author Response
The recommendations given will be adopted, and the file with the answers to each item will be attached.
Author Response File: Author Response.pdf
Reviewer 3 Report
The manuscript by Ceballos et al., “A machine learning based pipeline for the classification of CTX-M in metagenomic samples”, presents an ANN-based model for the classification of the CTX-M group of genes. The manuscript is poorly written and has several issues in its current state. See the comments below.
Major comments
1. The authors failed to clearly explain the problem and their ANN-based model. They should explain in detail what the input to the model is and what the outcome is.
2. The authors used data from a database but again failed to provide complete details.
3. The authors mention the activation functions “Tang” and “ELU”. I think these are “Tanh” and “RELU”, respectively; if not, the authors should define their functions.
4. The authors present ROC curves for the training data; it is not clear whether they tested the model on independent test data.
5. Table 1 shows a precision test value of 1000 for the ELU activation function. This can’t be true, as this value should be in the range of 0-1.
6. The legend of Figure 6 lists 10 classes, but the plot shows only 4 curves. What happened to the remaining 6 classes?
7. The language of the manuscript is poor and should be edited by a native English speaker or a professional editing service. There are many formatting errors throughout the manuscript, which should be fixed.
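The range mentioned in comment 5 follows directly from the standard definition of precision, TP / (TP + FP). The sketch below uses hypothetical counts, not values from the manuscript, and the zero-denominator convention is an assumption chosen here for simplicity.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Precision = TP / (TP + FP), which always lies in [0, 1]."""
    predicted_positive = true_positives + false_positives
    if predicted_positive == 0:
        # Convention assumed here: no positive predictions gives 0.0.
        return 0.0
    return true_positives / predicted_positive


# Hypothetical counts for illustration only.
perfect = precision(50, 0)   # every positive prediction is correct
typical = precision(45, 5)   # 45 of 50 positive predictions are correct
```

Because the numerator can never exceed the denominator, even a perfect classifier reaches a precision of 1 at most.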
Minor comments
1. The Introduction contains subsections, which does not conform to the formatting guidelines of the journal.
2. Many words are split with a hyphen, possibly because of an error, for instance, Concern-ing, under-stand, and many others. These words should be written without a hyphen.
Author Response
Thank you for your comments. We will proceed to make the necessary corrections.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
The manuscript submitted by Ceballos et al. has been improved following my suggestions; however, some points have not been addressed properly (see below).
Comment:
- Please define the CTX-M genes in the Introduction section (e.g., CTX-M-type enzymes are a group of class A extended-spectrum β-lactamases (ESBLs) that are rapidly spreading among Enterobacteriaceae worldwide).
- Names of genera and species must be in italics.
- Please rephrase lines 114-130. The text appears disjointed.
- Where is Figure 1?
- Figure 2 must be described in the main text.
- Figure 3 is not cited in the main text.
- Paragraph 3.1 needs to be expanded with text. It cannot contain just a figure.
- Paragraph 3.2 contains different figures. Ideally, the readers must understand your work just reading the text. The figures are ok for better understanding your workflow and results, but they cannot be the exclusive content of your article.
- Change the subtitle “Future work” to “Perspective”.
Author Response
Thank you for your comments. With respect to your comment:
"Paragraph 3.2 contains different figures. Ideally, the readers must understand your work just reading the text. The figures are ok for better understanding your workflow and results, but they cannot be the exclusive content of your article."
I consider that these figures are important for showing the results more clearly.
Author Response File: Author Response.pdf
Reviewer 2 Report
Some minor comments
The English still needs to be improved.
Line 25: "later implementation in the TensorFlow framework, and the visualization of the behavior of the Artificial Neural Networks in TensorBoard". This sentence is hard to understand. Consider changing it to "later implemented in the TensorFlow framework and the behavior of the Artificial Neural Networks was visualized in TensorBoard".
Figure 2. Details of the computational pipeline. It would be helpful to briefly introduce each step.
line 219. Should list it as a table
The elements of Figure 4 are hard to see. The resolution may need to be increased.
Author Response
Thank you for your comments.
I have made the adjustments according to your comments, except for "line 219. Should list it as a table", because I think that a table already exists in the next paragraph.
Author Response File: Author Response.pdf
Reviewer 3 Report
The authors addressed all my concerns and provided sufficient information. A few typos remain. The authors still use the "tang" and "ELU" functions, which should be fixed before acceptance. Figure quality is poor for almost all figures.
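For reference, the standard definitions of the activation functions discussed in this comment are given below. This is general background, not taken from the manuscript, and the parameter α > 0 in ELU is a hyperparameter whose value is assumed rather than reported. Note that ELU (exponential linear unit) is itself an established activation function and is distinct from ReLU, whereas "tang" has no standard meaning and most plausibly refers to tanh.

```latex
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \qquad
\operatorname{ReLU}(x) = \max(0, x), \qquad
\operatorname{ELU}(x) =
\begin{cases}
  x, & x > 0, \\
  \alpha \left( e^{x} - 1 \right), & x \le 0.
\end{cases}
```

The outputs of tanh lie in (-1, 1), ReLU outputs are non-negative, and ELU outputs are bounded below by -α.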
Author Response
Thank you for your comments. I have made the adjustments.
Author Response File: Author Response.pdf