Article
Peer-Review Record

A New Joint Training Method for Facial Expression Recognition with Inconsistently Annotated and Imbalanced Data

Electronics 2024, 13(19), 3891; https://doi.org/10.3390/electronics13193891
by Tao Chen 1, Dong Zhang 1,* and Dah-Jye Lee 2
Reviewer 1: Anonymous
Reviewer 3: Anonymous
Submission received: 8 September 2024 / Revised: 27 September 2024 / Accepted: 30 September 2024 / Published: 1 October 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The authors investigate the interesting problem of a multi-dataset joint method to improve the performance of Facial Expression Recognition (FER) systems. They address two specific challenges: the annotation inconsistency among different FER datasets and the class imbalance across datasets.

They propose an interesting approach named Sample Selection and Paired Augmentation Joint Training (SSPA-JT), aimed at solving both of these problems.

The article is well-written and organized, and the methods are clearly described. The authors present a comprehensive set of tests to demonstrate the performance of the proposed approach, which in most cases outperforms state-of-the-art results.

However, there are some aspects that deserve attention:

  1. In Section 3, the authors do not state which method is used for feature extraction.
  2. In Figure 3, it is unclear what LCE stands for.
  3. In Section 3.1, feedforward networks are used to implement a mixture of experts. The authors should describe the type of network they use.
  4. In Tables 3 and 4, there are some errors in the citations. For example, the DCJT method is cited as reference 15 instead of 16. The authors should review all citations in the tables.
  5. At line 361, there is a row misalignment.
  6. To improve the quality of the presentation, the authors should include in the article some images from the datasets used. It would be interesting to see an example of a noisy label.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This is my review of the paper: A New Joint Training Method for Facial Expression Recognition with Inconsistently Annotated and Imbalanced Data. While the paper addresses a compelling topic and is generally well-structured, several areas require improvement:

1. The abstract’s introduction is too abrupt. It would benefit from a brief statement highlighting the importance of the domain before diving into the specifics of your approach.

2. The word “pervasive” may not be the most suitable choice to describe annotation inconsistencies. Consider using a more precise term.

3. Line 20: the intended phrase is "Rather than acquiring."

4. Lines 33-36: This sentence needs rephrasing for clarity. The current structure might lead readers to believe that "most classes (i.e., tail classes) have very few samples because certain facial expressions naturally occur more frequently in real life," which could cause confusion.

5. Line 37: When you mention "these datasets," it’s unclear which datasets you're referring to. Please clarify.

6. Regarding your contributions, claiming that you "discovered" the impact of class imbalance on classification performance seems overstated, as this is a well-known fact. Rephrase this to reflect the novelty of your specific contribution.

7. In Contribution 3, maintain consistency in phrasing by using "we" as in the first two contributions.

8. Figures should be introduced after providing relevant context. For instance, Figure 3 should be repositioned after it is first mentioned in the text.

9. Please provide a description or definition for the acronym "SLSQP."

10. In line 363, there’s a stray "clean samples" before the start of the paragraph. Please remove it.

11. Clarify the total number of samples used from AffectNet in your experiments. You mention 450,000 samples across 11 expressions, but in the experiments, only 7 expressions from the other two datasets are referenced.

12. Explain why you opted for macro versions of precision, recall, and F1-score, but not for accuracy. This rationale would strengthen the clarity of your evaluation choices.

13. Please provide justifications for the parameter values chosen in the section covering lines 433-439.

14. When comparing your method to state-of-the-art approaches, it would be helpful to include other performance metrics beyond those already discussed.
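The concern behind comment 12 can be illustrated with a small example: on imbalanced data, accuracy can stay high while macro-averaged metrics reveal that a tail class is never recognized. The sketch below uses plain Python and made-up labels (the class names, the 90/10 split, and the helper functions `accuracy` and `macro_prf` are assumptions for illustration, not taken from the paper's experiments).

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_prf(y_true, y_pred):
    """Macro-averaged precision, recall, and F1: each class counts equally."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical imbalanced test set: 90 head-class and 10 tail-class samples.
y_true = ["happy"] * 90 + ["fear"] * 10
# A degenerate classifier that always predicts the head class.
y_pred = ["happy"] * 100

acc = accuracy(y_true, y_pred)
macro_p, macro_r, macro_f1 = macro_prf(y_true, y_pred)
print(f"accuracy={acc:.3f} macro-P={macro_p:.3f} "
      f"macro-R={macro_r:.3f} macro-F1={macro_f1:.3f}")
```

In this toy case accuracy is 0.9 even though the tail class is missed entirely, while macro recall drops to 0.5. This is the kind of gap a justification of the chosen metrics would need to address.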

Comments on the Quality of English Language

Quality is good, with just some small issues mentioned in the comments.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

In this work, the authors propose a novel multi-dataset joint training method for Facial Expression Recognition (FER) that includes Sample Selection and Paired Augmentation Joint Training. While the approach is intriguing, I have several concerns that require the authors' attention:

1. Please provide a specific percentage improvement in performance over existing techniques in the abstract, rather than merely stating the achieved results.

2. Clarify the distinction between traditional active learning techniques, such as uncertainty-based sample selection, and the proposed method. A detailed discussion and comparison are necessary to highlight differences.

3. In the methodology section, please include a clear algorithm for the sample selection process.

4. In Tables 1, 2, and 3, please sort the comparison methods chronologically by year for easier reference.

5. The proposed work appears quite similar to the approach in Reference 14. Please elaborate on how your work differs and clarify the novelty of your contributions.

6. Many statements, especially in the introduction and literature review sections, are made without citing relevant work. The authors should ensure that related studies are properly cited, including works such as: https://link.springer.com/chapter/10.1007/978-3-031-08341-9_33 and https://www.sciencedirect.com/science/article/abs/pii/S0010482524009077
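For reference, the "uncertainty-based sample selection" named in comment 2 is commonly realized as entropy ranking: the samples whose predicted class distribution has the highest entropy are selected first. The sketch below shows that generic active-learning baseline only; it is not the sample selection step of SSPA-JT, and the function names and probability values are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(prob_rows, k):
    """Return the indices of the k samples with the highest predictive entropy."""
    ranked = sorted(
        range(len(prob_rows)),
        key=lambda i: entropy(prob_rows[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical softmax outputs for four samples over three expression classes.
probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform, highest entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]

picked = select_most_uncertain(probs, k=2)
print(picked)
```

Active learning of this kind picks uncertain samples in order to request new labels for them, so a comparison with the proposed method would need to state what its selection criterion is and what happens to the selected samples.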

Comments on the Quality of English Language

Minor editing of English language required.

Author Response

Please see the attachment. 

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have answered all of my concerns.
For the final version, please recheck the revised manuscript for typographical errors (such as missing spaces after full stops ".") that might have been introduced with the new content.

Reviewer 3 Report

Comments and Suggestions for Authors

Thanks for the response. 

Comments on the Quality of English Language

Minor editing of English language required.
