Article
Peer-Review Record

Efficient and Intelligent Feature Selection via Maximum Conditional Mutual Information for Microarray Data

Appl. Sci. 2024, 14(13), 5818; https://doi.org/10.3390/app14135818
by Jiangnan Zhang 1, Shaojing Li 2, Huaichuan Yang 2, Jingtao Jiang 3 and Hongtao Shi 2,*
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 15 May 2024 / Revised: 24 June 2024 / Accepted: 2 July 2024 / Published: 3 July 2024
(This article belongs to the Section Applied Biosciences and Bioengineering)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The authors propose a method for feature selection based on ML algorithms.

Unfortunately, the presented work has several weaknesses.

Major problems:

1. The fundamental function, i.e., mutual information, is not defined. The reader can deduce that it is positively defined and additive in the feature parameters, but it is never explicitly defined. I assume a definition exists, because the authors report the results of calculations.

2. The results are not verified. The authors claim that the proposed method is capable of distinguishing parameters/features crucial for future development, but this is not discussed. I expect the final set of parameters to be verified against the aim of the research, that is, that they make it possible to identify biologically important genes/features.
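
For reference on point 1, the standard information-theoretic definitions for discrete random variables are sketched below. Whether the manuscript uses exactly these forms is not stated in this report; the symbols X, Y, Z, and p are notation assumed here for illustration only.

```latex
I(X;Y) = \sum_{x}\sum_{y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)},
\qquad
I(X;Y \mid Z) = \sum_{z} p(z) \sum_{x}\sum_{y} p(x,y \mid z)\,
  \log\frac{p(x,y \mid z)}{p(x \mid z)\,p(y \mid z)}
```

Both quantities are non-negative, and I(X;Y) = 0 exactly when X and Y are independent.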

Minor problems:

1. The usage of the ML algorithms should be clarified. They are sensitive to technical elements: number of layers, size of layers, size of the training set, size of the test set, etc. This is important because the paper is based on very limited data, so the technical details should be included.

2. Line 375: a very strong conclusion about "significant superiority" should be followed by a comment that it relates to a very limited set of examples and is not general.

3. In the tables of execution time, the unit is missing. Besides that, this result is not important. For potential users, it would be enough to state that a run takes around half a minute, provided my guess that the unit is seconds is correct.

4. Figures are split between pages.

5. Minor punctuation misprints.

Comments on the Quality of English Language

1. Figures are split between pages.

2. Minor punctuation misprints.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This article addresses feature selection for microarray data analysis. The authors propose a new algorithm and evaluate it with three databases, comparing their results to classic feature selection algorithms. The article is interesting and has the potential to be published. However, I recommend some changes before it is accepted.

First, I think the authors should name the algorithm and cite it in the abstract. In some parts of the text, they use the acronym MCMI to define the algorithm, but this is not clearly defined. If you decide to use this name, include the acronym MCMI in the abstract.

The main problem with this article is the large number of acronyms, which can generate a lot of confusion for readers. Some acronyms are cited in the tables and figures but are not defined in the caption. Also, some acronyms are unnecessary because they are rarely cited (such as CV: cross-validation). I recommend the authors create a section at the end of the text to summarize all the cited acronyms. Furthermore, all acronyms cited in tables must be defined again in the table caption.

Section 2 presents the algorithm, but it is unclear how the "greed search" shown in Figure 1 works. Could you provide a better example of this in the text?
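
As an illustration of what such an example might look like, here is a minimal sketch of a generic greedy forward search driven by conditional mutual information. It is not the authors' implementation: the function names, the plug-in (count-based) estimators, the assumption of discrete features, and the toy data are all hypothetical choices made for this sketch.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two discrete sequences."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # sum over observed pairs of p(x,y) * log(p(x,y) / (p(x) p(y)))
    return sum((c / n) * log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def conditional_mutual_information(xs, ys, zs):
    """Plug-in estimate of I(X; Y | Z) = sum_z p(z) * I(X; Y | Z = z)."""
    n = len(xs)
    groups = {}
    for x, y, z in zip(xs, ys, zs):
        gx, gy = groups.setdefault(z, ([], []))
        gx.append(x)
        gy.append(y)
    return sum(len(gx) / n * mutual_information(gx, gy)
               for gx, gy in groups.values())


def greedy_select(features, labels, k):
    """Pick k feature names, one per step (hypothetical helper).

    At each step the candidate with the largest estimated
    I(candidate; labels | already selected features) is added.
    """
    n = len(labels)
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        # Joint value of the selected features is the conditioning variable;
        # with nothing selected this reduces to plain mutual information.
        cond = [tuple(features[f][i] for f in selected) for i in range(n)]
        best = max(remaining,
                   key=lambda f: conditional_mutual_information(
                       features[f], labels, cond))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made up): "b" duplicates "a", so it adds nothing once "a" is chosen.
features = {
    "a": [0, 0, 1, 1],
    "b": [0, 0, 1, 1],
    "c": [0, 1, 0, 1],
}
labels = [0, 1, 2, 3]
print(greedy_select(features, labels, 2))  # ['a', 'c']
```

In the toy data, all three features are equally informative on their own, so the first pick falls to the first candidate; in the second step the redundant copy "b" scores zero given "a", and "c" is chosen instead. Conditioning on the joint value of all selected features is only practical for small subsets; a real implementation would need a more careful estimator.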

Line 256: Please transpose Table 1 (i.e., in the first line, include "dataset", "acronym", "raw data type", "feature number", and "sample number"). Also, include the number of classes.

In Table 2, what do the acronyms Rel, NRF, FN, and OA stand for? Define these acronyms in the table title or add them to the table footer.

Does Table 2 present the results of the classification models without feature selection (did you test this?), or is it the result of your algorithm? That is not clear.

Are all these decimal places really necessary? I recommend including only 3 decimal places.

In Table 3, what are OA and FN? Include the description in the caption.

Figures 2 and 3: why didn't you include your algorithm in the graph?

Line 509: I recommend providing access to the data used and the algorithm's implementation in a public repository such as GitHub or Zenodo. Make the access link available in the "Data Availability Statement" section.

Minors:

Lines 11-13: where you see: "Addressing this, we propose a novel feature selection algorithm based on maximum conditional mutual information, aimed at identifying a minimal feature subset that is maximally relevant and non-redundant"

change to: "Addressing this, we propose a novel feature selection algorithm based on maximum conditional mutual information to identify a minimal feature subset that is maximally relevant and non-redundant"

Lines 99-110: numbers 1, 2, and 3 are repeated twice.

Line 204: "O(N)when" change to "O(N) when"

Line 306: "Experimntal" change to "Experimental"

Line 306: "alogorithm" change to "algorithm"

Line 327: "datasets[10,11]." change to "datasets [10,11]."

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors answered the questions raised and introduced the necessary changes.

Although the subject and amount of data are not fully convincing, the paper can be published in its present form.

The last issue that should be corrected is the layout of the paper, but this is only a technical issue.

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have addressed all my concerns. Therefore, I recommend that the article be accepted for publication.
