Article
Peer-Review Record

ProLSFEO-LDL: Prototype Selection and Label-Specific Feature Evolutionary Optimization for Label Distribution Learning

Appl. Sci. 2020, 10(9), 3089; https://doi.org/10.3390/app10093089
by Manuel González 1,*, José-Ramón Cano 2 and Salvador García 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 5 April 2020 / Revised: 24 April 2020 / Accepted: 26 April 2020 / Published: 29 April 2020

Round 1

Reviewer 1 Report

The article entitled “ProLSFEO-LDL: Prototype selection and Label-Specific Feature Evolutionary Optimization for Label Distribution Learning” discusses the application of a combined feature and prototype selection mechanism to improve the quality of solutions to the label distribution learning (LDL) problem. In the proposed solution, the authors use the CHC evolutionary algorithm. This algorithm was previously used by the authors, with great success, to solve the instance selection problem in classification tasks. The extension and adaptation of this algorithm to a new challenge therefore seems to be a natural and correct direction.

The problem of LDL has been defined relatively recently, so it is good to see that the authors conduct research in this area.

In general, the article is very well written and well-structured, and the experiments are well conducted with proper validation of the results. It does not require any significant improvements, except for one element that requires a better explanation.

My request concerns a better explanation of how the cross-validation procedure was carried out. In the text it is written that the results are the average over a 10-fold cross-validation (CV) procedure, but it is also indicated that the fitness function used CV to estimate its value. So please explain whether the final results were obtained using wrapped (nested) cross-validation (one outer CV over the entire process, and a second, internal CV used only to estimate the fitness function), or whether the final results were derived from the internal CV used when evaluating the fitness function. For the second option, there is a danger that the results are overestimated.
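The wrapped (nested) scheme described above can be sketched as follows. This is an illustrative toy example, not the authors' implementation: the fold splitter, the fitness function, and the AA-kNN-style predictor (mean of the k nearest neighbours' label distributions) are all stand-ins, and the data are random. The point is only that the inner CV, used to estimate fitness, operates exclusively on the outer training fold, so the outer test fold never leaks into model selection.

```python
import numpy as np

def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k (near-)equal folds."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)

def aa_knn_predict(X_tr, Y_tr, X_te, k=5):
    """Toy AA-kNN-style prediction: mean label distribution of the k nearest training samples."""
    preds = []
    for x in X_te:
        nn = np.argsort(np.linalg.norm(X_tr - x, axis=1))[:k]
        preds.append(Y_tr[nn].mean(axis=0))
    return np.array(preds)

def fitness(X_tr, Y_tr, k_inner=3):
    """Inner CV: estimate a candidate's quality using the training fold only."""
    folds = kfold_indices(len(X_tr), k_inner, seed=1)
    errs = []
    for i in range(k_inner):
        tr = np.concatenate([folds[j] for j in range(k_inner) if j != i])
        pred = aa_knn_predict(X_tr[tr], Y_tr[tr], X_tr[folds[i]])
        errs.append(np.mean((pred - Y_tr[folds[i]]) ** 2))
    return -np.mean(errs)  # higher fitness = lower inner-CV error

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = rng.dirichlet(np.ones(3), size=100)  # toy label distributions (rows sum to 1)

outer_folds = kfold_indices(len(X), 10)
outer_errors = []
for i in range(10):
    te = outer_folds[i]
    tr = np.concatenate([outer_folds[j] for j in range(10) if j != i])
    _ = fitness(X[tr], Y[tr])  # the fitness function never sees the outer test fold
    pred = aa_knn_predict(X[tr], Y[tr], X[te])
    outer_errors.append(np.mean((pred - Y[te]) ** 2))
```

Under this design the reported score (the mean of `outer_errors`) is an unbiased estimate; reporting the inner-CV scores instead would risk the overestimation mentioned above.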

I’m also a bit worried about the length of the chromosome, which seems to be very long and may lead to convergence problems for larger datasets, but as I understand it, that is outside the scope of this article.

I also suggest some minor changes which may improve the readability. These are:

  • I suggest explaining earlier in the text what the threshold t (which appears in Algorithm 1) is. Currently, this parameter is explained two pages after it is first used. I suggest adding a short note about it at the beginning of Section 3. Please also indicate the typical values of the threshold t.
  • I suggest explaining what 10-fcv means when it is used for the first time. Currently, it is first used on page 7, line 248, but it is only explained in line 323 (page 10).

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

This paper presented a data reduction algorithm that adapts to LDL constraints by addressing prototype selection and label-specific feature selection using the AA-kNN learner. The results were corroborated by statistical tests and showed a significant improvement in prediction time.

Overall, the paper is well written. Ways to improve readability and applicability in practical applications:

  1. Present some practical scenarios.
  2. Compare accuracy against existing optimisation techniques for benchmarking.
  3. Time complexity is compared only against the raw datasets.
  4. Check readability and sentence structure.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

The manuscript describes an evolutionary algorithm to jointly perform prototype (instance) and feature selection as a data reduction technique. It is used to identify the most representative training samples and features while simultaneously mitigating the negative effect of noise and irrelevant labels.

The approach is specifically designed to work in conjunction with the AA-kNN learner.

The paper is overall well written and concepts are adequately explained. The experimental validation is carried out correctly and rigorously, comparing the performance of AA-kNN with and without using the proposed data reduction scheme.

The experimental results show that the proposed approach leads to a slight performance increase in most cases while using, on average, half of the training data and features, which supports the authors' proposal.

However, some other approaches referenced in the manuscript address the same problems, either jointly or separately. I feel the conclusions of the paper would have been stronger if the proposed approach had been compared with other approaches addressing the same problem, in order to highlight its merits. Indeed, other recent approaches (e.g., [40]) seem to be more effective in terms of the reported metrics. I believe a discussion on this is needed to complement the presentation.

One possible limitation that could be of interest to consider in the future is that the feature selection is label-specific; in my understanding, this means the labels are used in the algorithm to select the best features. How would the presence of noisy (wrong) labels affect the performance? It would be interesting to inject some noise into the training labels so as to assess the robustness of the approach. In fact, in real scenarios, when collecting large amounts of data, the labeling is usually supported by automatic procedures that are likely to lead to incorrect sample labeling.
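The noise-injection experiment suggested above could be sketched as follows. This is a minimal illustration, not a procedure from the paper: the function name, the mixing scheme (blending a fraction of the label distributions with random Dirichlet noise), and all parameter values are assumptions chosen for the example.

```python
import numpy as np

def inject_label_noise(Y, fraction=0.1, strength=0.5, seed=0):
    """Perturb a fraction of rows of a label-distribution matrix Y (rows sum to 1)
    by mixing them with random Dirichlet noise, then renormalize."""
    rng = np.random.default_rng(seed)
    Y_noisy = Y.copy()
    n, m = Y.shape
    idx = rng.choice(n, size=int(fraction * n), replace=False)  # rows to corrupt
    noise = rng.dirichlet(np.ones(m), size=len(idx))            # random valid distributions
    Y_noisy[idx] = (1 - strength) * Y_noisy[idx] + strength * noise
    return Y_noisy / Y_noisy.sum(axis=1, keepdims=True)         # keep rows on the simplex

# Example: corrupt 20% of 50 uniform toy label distributions.
Y = np.full((50, 3), 1 / 3)
Y_noisy = inject_label_noise(Y, fraction=0.2)
```

Re-running the full pipeline at several noise fractions (e.g. 0%, 10%, 20%) and plotting the evaluation metrics against the noise level would give the robustness assessment the paragraph above asks for.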

In any case, I believe this study provides a contribution worth publishing. My recommendation is to accept the paper after a revision; please include results from other approaches in the revised version and provide a detailed discussion.

Typo: page 4 line 155: eliminats -> eliminate

Author Response

Please see the attachment.

Author Response File: Author Response.docx
