Article
Peer-Review Record

Inference Analysis of Video Quality of Experience in Relation with Face Emotion, Video Advertisement, and ITU-T P.1203

Technologies 2024, 12(5), 62; https://doi.org/10.3390/technologies12050062
by Tisa Selma 1,*, Mohammad Mehedy Masud 1,*, Abdelhak Bentaleb 2 and Saad Harous 3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 28 March 2024 / Revised: 20 April 2024 / Accepted: 23 April 2024 / Published: 3 May 2024
(This article belongs to the Section Information and Communication Technologies)

Round 1

Reviewer 1 Report (Previous Reviewer 1)

Comments and Suggestions for Authors

This study investigates the impact of end-to-end encrypted video streaming on the ability of network administrators to ensure user quality of experience (QoE). It proposes a novel approach using facial emotion recognition (FER) to investigate QoE and analyse the impact of advertising. It trains machine learning models using open-source FER data, real network conditions and user feedback during video insertion. While this approach offers a promising solution for improving user satisfaction in video streaming with end-to-end encryption, several issues must be addressed:

(1) In the abstract, please highlight the novel contributions and advances within the research area.

(2) Authors are encouraged to clearly articulate the primary challenge of facial emotion in video advertising approaches, a problem that this study effectively addresses.

(3) Emphasise the practical implications of your findings, discussing how your study can be analysed in real-world applications and its potential impact on future research and development.

(4) Discuss broader applications of your findings beyond the datasets and specific tasks tested, including potential adaptations of your model to related areas of facial emotion or other fields.

(5) I believe that the strength of the idea can be better demonstrated if the authors present the results on the basis of different video resolutions: 480p, 1080p or up to 4K. Some ML algorithms report positive results for sequences with a high degree of complexity, but they face a challenge with simple, short and unique sentences.

(6) In addition, the following works are recommended for citation:
1. https://doi.org/10.1109/ACCESS.2020.2965099
2. https://doi.org/10.1049/ipr2.12980
3. https://doi.org/10.1145/3531015

As regards presentation, the authors need to improve the transitions between sections so that the narrative flows more smoothly and each section builds logically on the previous one.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report (Previous Reviewer 2)

Comments and Suggestions for Authors

Thanks for your revision. I would like to follow up with the question regarding the contribution of this paper.

In the authors' reply, they claim, "We are not proposing any new machine learning algorithm rather we are proposing the framework starting from data collection, data preprocessing, feature extraction, training, and evaluation."
They also claim that "Our contributions include a novel machine learning approach for QoE estimation in video content, leveraging facial expressions, ad data, and ITU-T P.1203 video quality metrics."

What is the novelty of the proposed machine learning approach if it is not new? Note that the selection of input data (e.g., the usage of facial expression or not) is not considered as a novelty. The selection of features is also not considered as a novelty unless the paper proposes a new feature selection method instead of an experimental comparison using the existing methods.

Comments on the Quality of English Language

NA

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report (Previous Reviewer 1)

Comments and Suggestions for Authors

All problems have been addressed

Reviewer 2 Report (Previous Reviewer 2)

Comments and Suggestions for Authors

NA

Comments on the Quality of English Language

NA

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This work suggests a novel method that uses user facial emotion recognition to deduce QoE and study the effect of ads. The authors use open-access Face Emotion Recognition (FER) datasets and extract facial emotion information from actual observers to train machine learning models. Some problems need to be addressed:

Major:

1.  The objective of this work is puzzling. ITU-T P.1203 was the first released quality assessment standard, and its test conditions are outdated compared with the current environment. In fact, P.1203 is limited to 1080p and does not support the H.265 standard, which makes it out of place here. Why not discuss ITU-T P.1204 instead of P.1203?

2.  Please provide a complete description of the “ML ALGORITHMS” block in Figure 6, which is of most interest to the reader. How is supervised learning connected in your model architecture? For example, give the convergence details for classification learning.

Minor:

1.  Authors are encouraged to review the paper and correct the typos. There are some strange “Error! Reference source not found” messages appearing in the first section.

Overall, the authors need to improve the presentation; there is a lot of redundant information in the content. Besides, this work seems to present a social study rather than a research work.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper suggests a method that uses user facial emotion recognition to deduce QoE and study the effect of ads. It applies open-access FER datasets and extracts facial emotion information from actual observers to train machine learning models. Various machine learning methods are studied in Table 10. It claims to propose an accurate machine learning model that can estimate QoE by considering advertisements and user expressions. However, it is not clear what machine learning model is "proposed" in Sections 3.4 and 3.5. It seems to be an experimental comparison paper. In addition, Table 10 uses different train-test split ratios in the experimental setups. This might not be a fair comparison.
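The concern about differing train-test split ratios can be illustrated with a minimal sketch. It is not the authors' setup: the toy data, the two placeholder models, and the helper names (`fixed_split`, `fit_majority`, `fit_threshold`) are all hypothetical. The point is only that every model is trained and scored on the same seeded split, so differences in accuracy reflect the models rather than the partition.

```python
import random


def fixed_split(n_samples, test_ratio=0.2, seed=0):
    """Return (train_idx, test_idx); identical for identical arguments."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_ratio))
    return indices[n_test:], indices[:n_test]


def accuracy(predict, X, y, idx):
    """Fraction of the samples in idx that predict() labels correctly."""
    return sum(predict(X[i]) == y[i] for i in idx) / len(idx)


def fit_majority(X, y, idx):
    """Placeholder model: always predict the majority training label."""
    ones = sum(y[i] for i in idx)
    label = int(ones * 2 >= len(idx))
    return lambda x: label


def fit_threshold(X, y, idx):
    """Placeholder model: predict 1 above the mean training feature."""
    t = sum(X[i] for i in idx) / len(idx)
    return lambda x: int(x > t)


# Hypothetical toy data: one feature, label is 1 when the feature > 0.5.
rng = random.Random(42)
X = [rng.random() for _ in range(200)]
y = [int(x > 0.5) for x in X]

# One split, created once and shared by every model under comparison.
train_idx, test_idx = fixed_split(len(X), test_ratio=0.2, seed=0)

models = {"majority": fit_majority, "threshold": fit_threshold}
scores = {
    name: accuracy(fit(X, y, train_idx), X, y, test_idx)
    for name, fit in models.items()
}
print(scores)
```

Reusing `train_idx` and `test_idx` for each entry in `models` is the design choice that matters here; repeating the comparison over several seeds, or using cross-validation, would reduce dependence on any single partition.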


The paper claims that it compares and analyzes the experimental results based on the ITU-T P.1203 standard and existing studies [6]. Please clarify what the difference is between this paper and [6].

Some paragraphs could be simplified, such as the last few paragraphs in Section 1.

Comments on the Quality of English Language

NA

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
