Inference Analysis of Video Quality of Experience in Relation with Face Emotion, Video Advertisement, and ITU-T P.1203
Abstract
1. Introduction
- A reliable measurement of QoE is a fundamental step in multimedia communication, and evaluating the effects of video advertisements on QoE requires experiments that consider all the factors influencing QoE;
- The experimental results were compared and analyzed against the ITU-T P.1203 standard and previous studies [7];
- We propose an accurate machine learning approach that estimates QoE while considering advertisements and user expressions;
- Obtaining subjective QoE is expensive in terms of money, effort, and time; we therefore offer an alternative solution to this problem;
- The ITU-T P.1203 standard does not account for several significant factors, which can lead to inaccurate QoE estimates; we aim to alleviate this issue with our proposed approach;
- Excessive video advertisements during a video streaming session may weaken user engagement and negatively affect QoE. We devised a method in which a viewing session can coexist with several advertisements provided a specific threshold is met.
1. Research Question 1: Leveraging Machine Learning for QoE Inference
1.1. Challenge: Existing methods such as ITU-T P.1203 offer limited accuracy owing to their lack of consideration of user emotions, advertisement data, and real-time video quality conditions.
1.2. Objective: To develop a novel ML approach that integrates ITU-T P.1203 results, facial expressions, and advertisement data to infer video QoE in real time.
1.3. Contribution: The ML approach was proposed and compared with state-of-the-art algorithms, demonstrating its superior accuracy and real-time applicability for video QoE inference.
2. Research Question 2: High-Quality Data Collection and Preparation
2.1. Challenge: Ensure the quality and diversity of the collected data while minimizing potential biases in participant responses.
2.2. Objective: To collect and prepare a comprehensive and unbiased dataset suitable for effective ML model training.
2.3. Contribution:
- We designed and utilized online platforms to collect data from a large and diverse pool of participants (661 from 114 countries);
- We developed comprehensive questionnaires (100+ questions) to minimize bias and ensure data accuracy;
- We implemented data pre-processing techniques (feature extraction and attribute selection) to enhance data quality and model performance;
- We enriched the model with training data from various sources.
3. Research Question 3: Data Quality Improvement for Machine Learning
3.1. Challenge: Identify the most effective data pre-processing techniques for a specific dataset and model while balancing data quality and quantity, handling outliers, and addressing potential inconsistencies.
3.2. Objective: Enhance data quality to improve ML model performance and generalizability.
3.3. Contribution:
- Implementing appropriate data pre-processing techniques to address the identified challenges;
- Utilizing various data sources to enrich the model and enhance its accuracy and generalizability;
- Performing iterative ML experiments and adjustments to optimize model training while considering the challenges encountered.
4. Research Question 4: Balancing Advertisement Placement and User Experience
4.1. Challenge: Understanding diverse cultural preferences and user behaviors regarding advertisements across regions and demographics, while balancing the maximization of advertisement effectiveness against preserving user engagement and video QoE.
4.2. Objective: Investigate optimal advertisement strategies that balance the advertiser's budget with user experience.
4.3. Contribution:
- Conducting surveys to gather diverse cultural and user-centric insights into acceptable advertising practices;
- Providing recommendations for optimizing advertisement duration and placement strategies to balance budget, user engagement, and video QoE, considering the identified cultural and behavioral factors.
2. Background and Related Work
2.1. Quality of Experience
2.2. ITU-T P.1203 Standard
2.3. Face Emotion Recognition (FER)
2.4. HTTP Adaptive Streaming (HAS)
2.5. QoE Influence Factors
2.6. QoE Metrics
- Source video quality—Content quality may be affected by the characteristics of the original video, such as codec type and video bitrate;
- QoS—This primarily considers how packets or video chunks travel through the network from source to destination; the relevant technical details include packet loss, jitter, delay, and throughput;
- MOS (subjective QoE measurement)—This captures human perception or satisfaction level;
- Objective QoE measurement—This denotes assessment models for estimating/predicting subjective video quality by extracting important QoE metrics, for example, the stalling frequency and stalling duration (a small sketch follows this list).
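As a concrete illustration of the last point, the short sketch below derives two common objective stalling metrics (stalling frequency and total stalling duration) from a list of player buffering events. It is a minimal, self-contained example; the event structure and function names are illustrative assumptions rather than part of any standard or of this study's implementation.

```python
# Illustrative sketch (not the paper's implementation): deriving simple
# objective QoE metrics -- stalling frequency and total stalling duration --
# from hypothetical player rebuffering events.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class StallEvent:
    start_s: float      # media time when rebuffering began (seconds)
    duration_s: float   # how long playback was stalled (seconds)

def stalling_metrics(events: List[StallEvent], session_length_s: float) -> Tuple[float, float, float]:
    """Return (stalls per minute, total stalling time, stalling ratio)."""
    total_stall = sum(e.duration_s for e in events)
    freq_per_min = len(events) / (session_length_s / 60.0) if session_length_s else 0.0
    # stalling ratio: share of wall-clock time spent rebuffering
    ratio = total_stall / (session_length_s + total_stall) if session_length_s else 0.0
    return freq_per_min, total_stall, ratio

# Example: two stalls during a 300 s session
events = [StallEvent(42.0, 1.5), StallEvent(180.0, 3.0)]
print(stalling_metrics(events, 300.0))
```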
2.7. QoE Assessment Types
3. Methodology
3.1. Video Watching Session
3.2. Data Collection and Storing
3.3. Combine All Data Approach
- ITU-T Rec. P.1203—Parametric bitstream-based quality assessment of progressive download and adaptive audiovisual streaming services over reliable transport;
- ITU-T Rec. P.1203.1—Parametric bitstream-based quality assessment of progressive download and adaptive audiovisual streaming services over reliable transport: Video quality estimation module;
- ITU-T Rec. P.1203.2—Parametric bitstream-based quality assessment of progressive download and adaptive audiovisual streaming services over reliable transport: Audio quality estimation module;
- ITU-T Rec. P.1203.3—Parametric bitstream-based quality assessment of progressive download and adaptive audiovisual streaming services over reliable transport: Quality integration module.
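These four modules are implemented in the open-source reference software published alongside Robitza et al. [33]. The sketch below shows how a per-session input for that tool might be assembled and evaluated; the JSON field names and the command-line invocation follow that tool's published mode-0 examples and are assumptions here, not a normative part of the Recommendation, so they may differ between versions.

```python
# Illustrative sketch of building a "mode 0" input for the open-source
# itu-p1203 reference implementation [33] and invoking it on one session.
import json
import subprocess

session = {
    # Pv input (P.1203.1): one entry per video segment
    "I13": {"segments": [
        {"bitrate": 5000, "codec": "h264", "duration": 5.0,
         "fps": 30.0, "resolution": "1920x1080", "start": 0.0},
    ], "streamId": 1},
    # Pa input (P.1203.2): audio segments (left empty here for brevity)
    "I11": {"segments": [], "streamId": 1},
    # Pq input (P.1203.3): initial loading / stalling events as [start, duration]
    "I23": {"stalling": [[0.0, 2.0]], "streamId": 1},
    "IGen": {"device": "pc", "displaySize": "1920x1080", "viewingDistance": "150cm"},
}

with open("session.json", "w") as f:
    json.dump(session, f)

# If the package is installed, this prints per-second scores and the overall MOS.
subprocess.run(["python3", "-m", "itu_p1203", "session.json"], check=False)
```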
3.4. Pre-Processing, Data Cleaning, Model Training and Evaluation
3.5. QoE Evaluation Result and Analysis
3.6. Analysis of Machine Learning Methodologies, Features Importance, and QoE Perceptions
4. Experimental Results
4.1. Survey Results and Statistics
4.2. Competing Approaches
4.3. Hardware and Software Setup
4.4. Evaluation
- Heavy and intensive advertisement loads may affect QoE and increase the ITU-T P.1203 results (i.e., higher bitrate, frame rate, and resolution), but this does not mean that the star reviews given by participants will be high;
- Possible QoE IFs from our experimental results include general video content factors and ad factors (ad length, number of ads, ad location, ad relation to content, repeated ads, and maximum acceptable number of ads);
- QoE was most impaired by mid-roll and unskippable ads (approximately 36.2% and 31.9%, respectively), and approximately 41.67% of users found it acceptable to watch an ad shorter than 10 s.
5. Discussions and Future Directions
- Real-time live FER system—We tested a real-time face emotion recognition system in the real world and found that it works effectively, although the emotion results do not yet drive traffic shaping. The proposed method uses emotion-aware advertisement insertion; shaping traffic based on emotions to improve QoE will be the focus of future research;
- Computational complexity of the system—The computational complexity of the proposed method is dominated by facial emotion recognition (FER). The FER process involves several steps. Face detection identifies the location of the face in a video frame; its complexity depends on the deep-face algorithm and can be written as O(n), where n is the number of pixels in a video frame. Feature extraction extracts features from the detected face; its complexity depends on the specific features used and is O(f), where f denotes the number of extracted features. Emotion classification assigns the extracted features to one of seven basic emotions (happiness, sadness, anger, fear, surprise, disgust, and neutrality); its complexity depends on the classifier used but can generally be considered O(c), where c is the number of emotion classes. A minimal per-frame sketch of this pipeline is given after this list;
- Therefore, the overall computational complexity of the FER process can be considered O(n · f · c). In the proposed method, the FER process is performed on each video frame, so the overall complexity is O(T · n · f · c), where T denotes the total number of video frames. For a 30 s video at 30 fps, T = 30 × 30 = 900. If the video frame is 640 × 480 pixels, then n = 640 × 480 = 307,200. With f = 100 features extracted from each face and c = 7 emotion classes, the overall cost is on the order of 900 × 307,200 × 100 × 7 ≈ 1.9 × 10^11 operations, which is relatively high. However, the FER process can be parallelized on a GPU to reduce the computational cost of the proposed method significantly. Moreover, the proposed method selects only the most relevant ads for a particular user; once the most relevant ad is selected, we record its location, type, and time, and these ads can then be served to users without further facial emotion recognition. Therefore, the overall computational impact of the proposed method is relatively small;
- Theoretical analysis of the proposed approach—The proposed machine learning (ML) approach for video Quality of Experience (QoE) inference, which incorporates face emotion recognition, user feedback on ad insertion, and network conditions, was evaluated on a dataset of 50 recorded video streaming sessions. This dataset included viewers' facial videos, network traffic logs, user feedback on ad insertion, and subjective QoE scores. The accuracy of the model was compared against two baselines: one utilizing only network conditions for QoE inference (ITU-T P.1203) and another employing only user feedback on ad insertion. The proposed approach consistently achieved a lower Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) than the baseline models, indicating superior accuracy in inferring QoE, as shown in Figure 8 and Table 9 (a small sketch of this comparison is given after this list);
- Qualitative analysis revealed the model's sensitivity to viewers' facial expressions, particularly joy, surprise, and frustration, which are known indicators of positive and negative QoE. The model also learned to identify advertisement placements perceived as disruptive by users and to adjust its QoE predictions accordingly. Moreover, it effectively uses network bandwidth as a critical indicator of potential rebuffering and stalling, which negatively impact QoE. These experimental results demonstrate the effectiveness of the proposed ML approach in accurately inferring video QoE. Its ability to integrate facial emotion recognition, user feedback on ad insertion, and network conditions provides a comprehensive understanding of QoE and offers promising scope for improving user satisfaction and network performance in video streaming systems;
- Our hypothesis, which maps the extracted emotion to ACR and MOS scales, is based on the studies by Porcu et al. [36] and Martinez-Caro and Cano [49]. Porcu et al. [36] analyzed facial expressions and gaze direction and achieved 93.9% accuracy with a k-NN classifier, investigating the possibility of estimating perceived QoE from facial emotions and gaze movement. Martinez-Caro and Cano [49] utilized the ITU-T P.1203 model to estimate MOS values; they used variational algorithms to predict QoE and provided insights into the emotional impact of video quality;
- The reasons for choosing ITU-T P.1203 over ITU-T P.1204 were carefully considered. First, ITU-T P.1203's scope matches our defined parameters, including 1080p resolution and the H.264 codec; we deliberately used these lower resolution and codec settings to simplify video data collection, storage, and analysis. Second, although ITU-T P.1204 is more versatile and applicable to multimedia in general, we required an assessment standard that covers only video. Third, the main objective of this study was to analyze the impact of advertisements and video quality on user emotions, so we adopted the simpler quality assessment standard; ITU-T P.1204 requires more complex operations and pre-processing, which are beyond the scope of this study. More specifically, ITU-T P.1203 offers five advantages within our study design. The first is direct compatibility: ITU-T P.1203 is designed for 1080p H.264 content, eliminating the need for complex transcoding or compatibility adjustments and ensuring smooth and efficient video analysis within our defined parameters. The second is established expertise: using P.1203 enabled us to leverage readily available resources and established research practices, so the team could focus on the core research questions rather than expending resources to adapt to and master a different standard. The third is the focus on video quality: although ITU-T P.1204 offers advanced capabilities, including support for higher resolutions and codecs, those features lie outside the scope of this study; expanding the analysis to diverse resolutions and codecs would add unnecessary complexity and dilute our research objective of understanding the emotional impact of video content within a well-defined parameter set. The fourth is perceptual modelling: ITU-T P.1203 incorporates models and algorithms that approximate human visual perception by considering factors such as spatial and temporal characteristics, ensuring that the obtained quality measurements relate directly to viewers' experience. The fifth is industry adoption: ITU-T P.1203 is widely used as a standardized method for video quality assessment in industry and research, ensuring the consistency and comparability of our findings across studies and implementations. For these reasons, ITU-T P.1203 was the appropriate choice for our video emotion research project; its direct compatibility, established expertise, focus on video quality, and industry-standard status enable efficient, reliable, and relevant analysis within the scope of this study and allow us to examine the emotional impact of video content in current streaming practice.
While we acknowledge the potential value of ITU-T P.1204 for future work encompassing broader video content types or analyses beyond the current limitations, P.1203 offers the balance of efficiency, expertise utilization, and focused analysis that is optimal for this specific research project;
- We acknowledge the presence of social elements in our study, particularly the observation of user behavior and emotions. However, we believe this work addresses significant research and technical challenges and contributes to video streaming QoE prediction. First, our approach uses automated facial emotion recognition (FER) algorithms, moving beyond subjective reports and providing an objective measure of user experience during video streaming; this aligns with research exploring the link between facial expressions and emotional responses to QoE. Second, we went beyond measuring emotions and developed a machine learning model that leverages the extracted features together with other technical data points to predict QoE with enhanced accuracy, offering a data-driven and generalizable solution for improving the user experience in video streaming. Third, against a technical benchmark, we demonstrated the efficacy of our approach by achieving a 37.1% improvement in accuracy over the established ITU-T P.1203 standard, a significant advance in QoE prediction. Moreover, we built our website as a unified platform for investigating video QoE inference from multimodal input. In conclusion, although the study incorporates social elements such as user observation, its core contribution lies in developing and evaluating a novel, technically grounded machine learning model for objective QoE prediction using facial recognition. We believe this work opens promising avenues for improving user engagement and experience in video streaming services;
- The practical implications of our findings are significant, particularly for video streaming services and advertisement placement strategies. By leveraging machine learning, facial emotion recognition, user feedback, and ITU-T P.1203 results, the proposed framework offers tangible benefits for users, advertisers, and network providers. It not only improves accuracy but also sheds new light on the detrimental effects of advertisements on user experience; these perspectives can inform network administrators and content providers about the importance of strategic ad placement for optimizing overall QoE. Our study lays a foundation for advances in user-driven video streaming services and advertisement strategies and, by demonstrating the effectiveness of the proposed framework, opens the door to several real-world applications as well as future research and development. We have shown that an effective QoE inference framework can enhance user experience, enable face-emotion-driven adaptive bitrate streaming, make targeted ad insertion based on user emotions possible, support better content and ad recommendation systems, and allow improved QoE monitoring for network administrators in response to user emotion, as well as content creation aligned with ad marketing. Future research directions include more advanced emotion recognition models that can recognize a wider range of subtle expressions, privacy-preserving techniques that anonymize user data while maintaining effectiveness, and the integration of multimodal physiological data that combine FER with other sources, such as heart rate, eye tracking, or audio, to capture the user experience more comprehensively;
- The application and potential adaptation of the proposed framework can be explored in several directions. First, in human–computer interaction (HCI), the proposed solution can be integrated into HCI systems to provide more responsive and intuitive interactions, such as facial-expression-driven virtual reality systems. Second, emotion-aware learning environments can interpret students' emotional states and attention during learning and adjust content, pace, and difficulty level. Third, telemedicine and healthcare could benefit from more empathetic and personalized care delivery. Fourth, retail and customer experience could be enhanced through videos capturing facial expressions, yielding better information on customer preferences, satisfaction, and engagement. Fifth, automotive companies can integrate facial emotion technology into vehicles to improve driver safety by detecting fatigue, drowsiness, and distraction. Sixth, artists and content creators can develop interactive, immersive performances with real-time facial expression feedback from audiences. Seventh, law-enforcement agencies could leverage models that detect deception in surveillance footage.
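The following is a minimal per-frame sketch of the FER pipeline discussed above (face detection, feature extraction, and emotion classification), assuming the open-source deepface package and OpenCV; it is an illustration rather than the exact implementation used in this study. The emotion-to-grade mapping follows the paper's grading table (happy 5, surprised 4, neutral 3, sad/fear 2, disgust/anger 1). Sampling roughly one frame per second, as done here, is one simple way to reduce the O(T · n · f · c) cost discussed above.

```python
# Minimal per-frame FER sketch (illustrative, not the study's implementation),
# assuming the deepface package (serengil/deepface) and OpenCV are installed.
import cv2
from deepface import DeepFace

# Mapping from deepface's emotion labels to the paper's 1-5 grading scale.
EMOTION_TO_GRADE = {"happy": 5, "surprise": 4, "neutral": 3,
                    "sad": 2, "fear": 2, "disgust": 1, "angry": 1}

def grade_session(video_path: str, sample_every_n_frames: int = 30):
    """Return the average emotion-derived grade over sampled frames."""
    cap = cv2.VideoCapture(video_path)
    grades, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % sample_every_n_frames == 0:  # ~1 frame/second at 30 fps
            result = DeepFace.analyze(frame, actions=["emotion"],
                                      enforce_detection=False)
            result = result[0] if isinstance(result, list) else result
            grades.append(EMOTION_TO_GRADE.get(result["dominant_emotion"], 3))
        idx += 1
    cap.release()
    return sum(grades) / len(grades) if grades else None

# Example (hypothetical file name):
# print(grade_session("viewer_webcam.mp4"))
```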
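The MAE/RMSE comparison against the two baselines mentioned above can be reproduced with a few lines of code; the sketch below uses placeholder score arrays, not the study's data.

```python
# Sketch of the MAE/RMSE comparison against the two baselines; the score
# arrays are placeholders for illustration only.
import numpy as np

def mae(y_true, y_pred):
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

subjective = [4, 5, 3, 2, 5]                     # ground-truth star ratings
predictions = {
    "proposed (FER + ads + network)": [4, 5, 3, 3, 5],
    "baseline: ITU-T P.1203 only":    [5, 5, 4, 4, 5],
    "baseline: ad feedback only":     [3, 4, 3, 2, 4],
}
for name, pred in predictions.items():
    print(f"{name}: MAE={mae(subjective, pred):.2f}, RMSE={rmse(subjective, pred):.2f}")
```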
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Abbreviation | Stands for |
---|---|
CCI | Correctly Classified Instances |
CNN | Convolutional Neural Network |
CV | Cross-Validation |
DL | Deep Learning |
FER | Face Emotion Recognition |
HAS | HTTP Adaptive Streaming |
HTTP | Hypertext Transfer Protocol |
ICI | Incorrectly Classified Instances |
ITU-T | International Telecommunication Union – Telecommunication Standardization Sector |
Mid-roll | Video advertisement in the middle of content playback |
MSE | Mean Squared Error |
Post-roll | Video advertisement at the end of content playback |
Pre-roll | Video advertisement before content playback started |
QoE | Quality of Experience |
QoS | Quality of Service |
UE | User Experience |
Weka | Waikato Environment for Knowledge Analysis |
Appendix B
Appendix C
ITU-res | FER | Star (Ground Truth) | Our Prediction |
---|---|---|---|
5 | 1 | 1 | 2 |
5 | 1 | 2 | 1 |
5 | 1 | 1 | 3 |
5 | 1 | 4 | 5 |
5 | 1 | 4 | 5 |
5 | 1 | 5 | 4 |
5 | 3 | 1 | 3 |
5 | 3 | 5 | 4 |
5 | 5 | 4 | 4 |
5 | 5 | 4 | 4 |
5 | 3 | 3 | 3 |
5 | 3 | 5 | 5 |
5 | 3 | 5 | 5 |
5 | 3 | 4 | 4 |
5 | 3 | 4 | 4 |
5 | 2 | 5 | 5 |
5 | 2 | 2 | 2 |
5 | 2 | 5 | 5 |
5 | 3 | 3 | 3 |
5 | 2 | 4 | 4 |
5 | 3 | 3 | 3 |
5 | 2 | 5 | 5 |
5 | 2 | 2 | 2 |
5 | 3 | 2 | 2 |
5 | 3 | 4 | 4 |
5 | 2 | 5 | 5 |
5 | 3 | 5 | 5 |
5 | 3 | 5 | 5 |
5 | 4 | 5 | 5 |
5 | 2 | 5 | 5 |
5 | 3 | 5 | 5 |
5 | 3 | 3 | 3 |
5 | 2 | 4 | 4 |
5 | 2 | 4 | 4 |
5 | 3 | 3 | 3 |
5 | 2 | 4 | 4 |
5 | 2 | 5 | 5 |
5 | 2 | 4 | 4 |
5 | 2 | 5 | 5 |
5 | 2 | 4 | 4 |
5 | 2 | 4 | 4 |
5 | 2 | 1 | 1 |
5 | 2 | 1 | 1 |
5 | 2 | 5 | 5 |
5 | 2 | 5 | 5 |
2 | 3 | 2 | 2 |
2 | 3 | 3 | 3 |
2 | 3 | 4 | 4 |
2 | 3 | 1 | 1 |
5 | 3 | 5 | 5 |
5 | 4 | 4 | 4 |
5 | 4 | 2 | 2 |
5 | 4 | 3 | 3 |
5 | 1 | 4 | 4 |
5 | 1 | 5 | 5 |
5 | 2 | 2 | 2 |
5 | 2 | 2 | 2 |
5 | 2 | 5 | 5 |
5 | 2 | 1 | 1 |
5 | 1 | 3 | 3 |
4 | 2 | 5 | 5 |
4 | 1 | 5 | 5 |
4 | 2 | 4 | 4 |
4 | 1 | 5 | 5 |
4 | 1 | 3 | 3 |
5 | 3 | 4 | 4 |
5 | 3 | 1 | 1 |
5 | 3 | 4 | 4 |
5 | 3 | 1 | 1 |
5 | 3 | 4 | 4 |
5 | 3 | 5 | 5 |
5 | 3 | 5 | 5 |
5 | 2 | 5 | 5 |
5 | 2 | 5 | 5 |
5 | 2 | 5 | 5 |
4 | 1 | 5 | 5 |
4 | 2 | 4 | 4 |
3 | 1 | 5 | 5 |
3 | 3 | 5 | 5 |
3 | 3 | 5 | 5 |
5 | 2 | 4 | 4 |
5 | 3 | 3 | 3 |
5 | 3 | 4 | 4 |
5 | 3 | 3 | 3 |
5 | 2 | 4 | 4 |
5 | 2 | 4 | 4 |
5 | 4 | 4 | 4 |
5 | 3 | 5 | 5 |
5 | 3 | 5 | 5 |
5 | 1 | 5 | 5 |
5 | 3 | 4 | 4 |
5 | 3 | 4 | 4 |
5 | 3 | 4 | 4 |
5 | 3 | 4 | 4 |
5 | 3 | 4 | 4 |
5 | 5 | 5 | 5 |
5 | 4 | 3 | 3 |
5 | 4 | 4 | 4 |
5 | 4 | 5 | 5 |
5 | 4 | 5 | 5 |
5 | 3 | 5 | 5 |
5 | 2 | 3 | 3 |
5 | 2 | 5 | 5 |
5 | 2 | 4 | 4 |
5 | 2 | 5 | 5 |
5 | 2 | 5 | 5 |
5 | 2 | 5 | 5 |
5 | 1 | 5 | 5 |
5 | 2 | 5 | 5 |
5 | 3 | 3 | 3 |
5 | 5 | 4 | 4 |
5 | 4 | 5 | 5 |
5 | 4 | 5 | 5 |
5 | 3 | 5 | 5 |
5 | 2 | 4 | 4 |
5 | 5 | 5 | 5 |
5 | 4 | 5 | 5 |
5 | 4 | 5 | 5 |
5 | 3 | 5 | 5 |
5 | 1 | 5 | 5 |
5 | 1 | 5 | 5 |
5 | 1 | 5 | 5 |
5 | 1 | 5 | 5 |
5 | 1 | 5 | 5 |
5 | 1 | 5 | 5 |
5 | 2 | 3 | 3 |
5 | 2 | 3 | 3 |
5 | 2 | 3 | 3 |
5 | 2 | 3 | 3 |
5 | 2 | 3 | 3 |
4 | 3 | 4 | 4 |
4 | 3 | 4 | 4 |
4 | 3 | 5 | 5 |
4 | 3 | 5 | 5 |
4 | 1 | 5 | 5 |
5 | 2 | 4 | 4 |
5 | 2 | 5 | 5 |
5 | 2 | 5 | 5 |
5 | 3 | 5 | 5 |
3 | 1 | 5 | 5 |
5 | 1 | 4 | 4 |
5 | 1 | 4 | 4 |
5 | 1 | 5 | 5 |
5 | 1 | 5 | 5 |
5 | 1 | 5 | 5 |
5 | 1 | 5 | 5 |
5 | 2 | 5 | 5 |
5 | 3 | 5 | 5 |
5 | 2 | 4 | 4 |
4 | 2 | 3 | 3 |
4 | 1 | 5 | 5 |
4 | 1 | 5 | 5 |
4 | 1 | 5 | 5 |
4 | 1 | 5 | 5 |
4 | 2 | 5 | 5 |
5 | 1 | 3 | 3 |
5 | 1 | 5 | 5 |
5 | 1 | 5 | 5 |
5 | 4 | 5 | 5 |
4 | 4 | 5 | 5 |
5 | 2 | 3 | 3 |
5 | 2 | 5 | 5 |
5 | 1 | 5 | 5 |
5 | 1 | 2 | 2 |
5 | 3 | 5 | 5 |
5 | 1 | 4 | 4 |
4 | 1 | 5 | 5 |
5 | 1 | 5 | 5 |
4 | 1 | 4 | 4 |
4 | 1 | 5 | 5 |
5 | 2 | 5 | 5 |
5 | 2 | 5 | 5 |
5 | 2 | 3 | 3 |
5 | 2 | 3 | 3 |
5 | 2 | 4 | 4 |
5 | 5 | 5 | 5 |
5 | 5 | 4 | 4 |
5 | 5 | 5 | 5 |
5 | 5 | 4 | 4 |
5 | 5 | 5 | 5 |
4 | 1 | 4 | 4 |
4 | 1 | 4 | 4 |
4 | 1 | 4 | 4 |
4 | 1 | 4 | 4 |
4 | 1 | 4 | 4 |
4 | 3 | 4 | 4 |
4 | 2 | 5 | 5 |
4 | 2 | 5 | 5 |
4 | 2 | 4 | 4 |
4 | 1 | 3 | 3 |
4 | 1 | 3 | 3 |
4 | 1 | 3 | 3 |
4 | 1 | 4 | 4 |
4 | 1 | 5 | 5 |
4 | 1 | 5 | 5 |
5 | 1 | 4 | 4 |
5 | 2 | 3 | 3 |
5 | 2 | 4 | 4 |
5 | 4 | 3 | 3 |
5 | 1 | 4 | 4 |
5 | 3 | 3 | 3 |
5 | 1 | 2 | 2 |
5 | 2 | 4 | 4 |
5 | 2 | 3 | 3 |
5 | 3 | 3 | 3 |
5 | 1 | 4 | 4 |
5 | 1 | 4 | 4 |
5 | 1 | 4 | 4 |
5 | 1 | 4 | 4 |
5 | 1 | 4 | 4 |
5 | 1 | 2 | 2 |
5 | 3 | 1 | 1 |
5 | 3 | 1 | 1 |
5 | 3 | 1 | 1 |
5 | 3 | 1 | 1 |
References
- Gutterman, C.; Guo, K.; Arora, S.; Wang, X.; Wu, L.; Katz-Bassett, E.; Zussman, G. Requet: Real-Time Quantitative Detection for Encrypted YouTube Traffic. In Proceedings of the 10th ACM Multimedia System Conference, Amherst, MA, USA, 18–21 June 2019. [Google Scholar]
- Izima, O.; de Fréin, R.; Malik, A. A survey of machine learning techniques for video quality prediction from quality of delivery metrics. Electronics 2021, 10, 2851. [Google Scholar] [CrossRef]
- Bouraqia, K.; Sabir, E.; Sadik, M.; Ladid, L. Quality of experience for streaming services: Measurements, challenges and insights. IEEE Access 2020, 8, 13341–13361. [Google Scholar] [CrossRef]
- Agboma, F.; Liotta, A. QoE-Aware QoS Management. In Proceedings of the 6th International Conference on Advances in Mobile Computing and Multimedia, Linz, Austria, 24–26 November 2008. [Google Scholar]
- Streijl, R.C.; Winkler, S.; Hands, D.S. Mean Opinion Score (MOS) Revisited: Methods and Applications, Limitations and Alternatives. Multimed. Syst. 2016, 22, 213–227. [Google Scholar] [CrossRef]
- Engelke, U.; Darcy, D.P.; Mulliken, G.H.; Bosse, S.; Martini, M.G.; Arndt, S.; Antons, J.N.; Chan, K.Y.; Ramzan, N.; Brunnström, K. Psychophysiology-Based QoE Assessment: A Survey. IEEE J. Sel. Top. Signal Process. 2016, 11, 6–21. [Google Scholar] [CrossRef]
- Raake, A.; Garcia, M.N.; Robitza, W.; List, P.; Göring, S.; Feiten, B. A Bitstream-Based, Scalable Video-Quality Model for HTTP Adaptive Streaming: ITU-T P.1203.1. In Proceedings of the 2017 Ninth International Conference on Quality of Multimedia Experience (QoMEX), IEEE, Erfurt, Germany, 31 May–2 June 2017; pp. 1–6. [Google Scholar]
- Garcia, M.-N.; Dytko, D.; Raake, A. Quality Impact Due to Initial Loading, Stalling, and Video Bitrate in Progressive Download Video Services. In Proceedings of the 2014 Sixth International Workshop on Quality of Multimedia Experience (QoMEX), Singapore, 18–20 September 2014; pp. 129–134. [Google Scholar]
- Pereira, R.; Pereira, E.G. Dynamic Adaptive Streaming over HTTP and Progressive Download: Comparative Considerations. In Proceedings of the 2014 28th International Conference on Advanced Information Networking and Applications Workshops, IEEE, Victoria, BC, Canada, 13–16 May 2014; pp. 905–909. [Google Scholar]
- Sackl, A.; Zwickl, P.; Reichl, P. The trouble with choice: An empirical study to investigate the influence of charging strategies and content selection on QoE. In Proceedings of the 9th International Conference on Network and Service Management (CNSM 2013), Zurich, Switzerland, 14–18 October 2013; pp. 298–303. [Google Scholar]
- Hoßfeld, T.; Seufert, M.; Hirth, M.; Zinner, T.; Tran-Gia, P.; Schatz, R. Quantification of YouTube QoE via crowdsourcing. In Proceedings of the 2011 IEEE International Symposium on Multimedia, Dana Point, CA, USA, 5–7 December 2011; pp. 494–499. [Google Scholar]
- Oyman, O.; Singh, S. Quality of experience for HTTP adaptive streaming services. IEEE Commun. Mag. 2012, 50, 20–27. [Google Scholar] [CrossRef]
- Yao, J.; Kanhere, S.S.; Hossain, I.; Hassan, M. Empirical evaluation of HTTP adaptive streaming under vehicular mobility. In Proceedings of the International Conference on Research in Networking, Madrid, Spain, 14–16 November 2011; pp. 92–105. [Google Scholar]
- Ghani, R.F.; Ajrash, A.S. Quality of Experience Metric of Streaming Video: A Survey. Iraqi J. Sci. 2018, 59, 1531–1537. [Google Scholar]
- Porcu, S.; Floris, A.; Atzori, L. Towards the Prediction of the Quality of Experience from Facial Expression and Gaze Direction. In Proceedings of the 2019 22nd Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN), IEEE, Paris, France, 19–21 February 2019; pp. 82–87. [Google Scholar]
- Akhshabi, S.; Anantakrishnan, L.; Begen, A.C.; Dovrolis, C. What happens when adaptive streaming players compete for bandwidth? In Proceedings of the 22nd International Workshop on Network and Operating System Support for Digital Audio and Video, Toronto, ON, Canada, 7–8 June 2012; pp. 9–14. [Google Scholar]
- Zinner, T.; Hossfeld, T.; Minhas, T.N.; Fiedler, M. Controlled vs. uncontrolled degradations of QoE: The provisioning-delivery hysteresis in case of video. In Proceedings of the EuroITV 2010 Workshop: Quality of Experience for Multimedia Content Sharing, Tampere, Finland, 9–11 June 2010. [Google Scholar]
- Cohen, W.W. Fast Effective Rule Induction. In Machine Learning Proceedings 1995; Elsevier: Amsterdam, The Netherlands, 1995; pp. 115–123. [Google Scholar]
- Landis, J.R.; Koch, G.G. An Application of Hierarchical Kappa-Type Statistics in the Assessment of Majority Agreement among Multiple Observers. Biometrics 1977, 33, 363–374. [Google Scholar] [CrossRef]
- Bermudez, H.F.; Martinez-Caro, J.M.; Sanchez-Iborra, R.; Arciniegas, J.L.; Cano, M.D. Live Video-Streaming Evaluation Using the ITU-T P.1203 QoE Model in LTE Networks. Comput. Netw. 2019, 165, 106967. [Google Scholar] [CrossRef]
- Le Callet, P.; Möller, S.; Perkis, A. Qualinet White Paper on Definitions of Quality of Experience. In Proceedings of the European Network on Quality of Experience in Multimedia Systems and Services 2013, Novi Sad, Serbia, 12 March 2013. [Google Scholar]
- Amour, L.; Boulabiar, M.I.; Souihi, S.; Mellouk, A. An Improved QoE Estimation Method Based on QoS and Affective Computing. In Proceedings of the 2018 International Symposium on Programming and Systems (ISPS), Algiers, Algeria, 24–26 April 2018. [Google Scholar]
- Bhattacharya, A.; Wu, W.; Yang, Z. Quality of Experience Evaluation of Voice Communication: An Affect-Based Approach. Hum.-Centric Comput. Inf. Sci. 2012, 2, 7. [Google Scholar]
- Porcu, S.; Uhrig, S.; Voigt-Antons, J.N.; Möller, S.; Atzori, L. Emotional Impact of Video Quality: Self-Assessment and Facial Expression Recognition. In Proceedings of the 2019 Eleventh International Conference on Quality of Multimedia Experience (QoMEX), Berlin, Germany, 5–7 June 2019. [Google Scholar]
- Antons, J.N.; Schleicher, R.; Arndt, S.; Moller, S.; Porbadnigk, A.K.; Curio, G. Analyzing Speech Quality Perception Using Electroencephalography. IEEE J. Sel. Top. Signal Process. 2012, 6, 721–731. [Google Scholar] [CrossRef]
- Kroupi, E.; Hanhart, P.; Lee, J.S.; Rerabek, M.; Ebrahimi, T. EEG Correlates During Video Quality Perception. In Proceedings of the 2014 22nd European Signal Processing Conference (EUSIPCO), Lisbon, Portugal, 1–5 September 2014. [Google Scholar]
- Arndt, S.; Antons, J.N.; Schleicher, R.; Möller, S. Using Electroencephalography to Analyze Sleepiness Due to Low-Quality Audiovisual Stimuli. Signal Process. Image Commun. 2016, 42, 120–129. [Google Scholar] [CrossRef]
- Arndt, S.; Radun, J.; Antons, J.N.; Möller, S. Using Eye-Tracking and Correlates of Brain Activity to Predict Quality Scores. In Proceedings of the 2014 Sixth International Workshop on Quality of Multimedia Experience (QoMEX), Singapore, 18–20 September 2014. [Google Scholar]
- Engelke, U.; Pepion, R.; Le Callet, P.; Zepernick, H.J. Linking Distortion Perception and Visual Saliency in H.264/AVC Coded Video Containing Packet Loss. Visual Commun. Image Process. 2010, 7744, 59–68. [Google Scholar]
- Rai, Y.; Le Callet, P. Do Gaze Disruptions Indicate the Perceived Quality of Nonuniformly Coded Natural Scenes? Electron. Imaging 2017, 14, 104–109. [Google Scholar]
- Rai, Y.; Barkowsky, M.; Le Callet, P. Role of Spatio-Temporal Distortions in the Visual Periphery in Disrupting Natural Attention Deployment. Electron. Imaging 2016, 16, 1–6. [Google Scholar] [CrossRef]
- Bailenson, J.N.; Pontikakis, E.D.; Mauss, I.B.; Gross, J.J.; Jabon, M.E.; Hutcherson, C.A.; Nass, C.; John, O. Real-time Classification of Evoked Emotions Using Facial Feature Tracking and Physiological Responses. Int. J. Hum.-Comput. Stud. 2008, 66, 303–317. [Google Scholar]
- Robitza, W.; Göring, S.; Raake, A.; Lindegren, D.; Heikkilä, G.; Gustafsson, J.; List, P.; Feiten, B.; Wüstenhagen, U.; Garcia, M.N.; et al. HTTP Adaptive Streaming QoE Estimation with ITU-T Rec. P.1203: Open Databases and Software. In Proceedings of the 9th ACM Multimedia Systems Conference 2018, Amsterdam, The Netherlands, 12–15 June 2018; pp. 466–471. [Google Scholar]
- International Telecommunication Union. Recommendation ITU-T P.1203.3, Parametric Bitstream-Based Quality Assessment of Progressive Download and Adaptive Audiovisual Streaming Services over Reliable Transport-Quality Integration Module. 2017. Available online: https://www.itu.int/rec/T-REC-P.1203.3/en (accessed on 28 March 2024).
- Bentaleb, A.; Taani, B.; Begen, A.C.; Timmerer, C.; Zimmermann, R. A survey on bitrate adaptation schemes for streaming media over HTTP. IEEE Commun. Surv. Tutor. 2018, 21, 562–585. [Google Scholar] [CrossRef]
- Porcu, S. Estimation of the QoE for Video Streaming Services Based on Facial Expressions and Gaze Direction. 2021. Available online: https://iris.unica.it/handle/11584/308985 (accessed on 28 March 2024).
- Roettgers, J. Don’t touch that dial: How YouTube is bringing adaptive streaming to mobile, TVs. 2013. Available online: http://finance.yahoo.com/news/don-t-touch-dial-youtube-224155787.html (accessed on 28 March 2024).
- Seufert, M.; Egger, S.; Slanina, M.; Zinner, T.; Hoßfeld, T.; Tran-Gia, P. A survey on quality of experience of HTTP adaptive streaming. IEEE Commun. Surv. Tutor. 2014, 17, 469–492. [Google Scholar] [CrossRef]
- Barman, N.; Martini, M.G. QoE Modeling for HTTP Adaptive Video Streaming—A Survey and Open Challenges. IEEE Access 2019, 7, 30831–30859. [Google Scholar] [CrossRef]
- Wang, W.; Lu, Y. Analysis of the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE) in Assessing Rounding Model. IOP Conf. Ser. Mater. Sci. Eng. 2018, 324, 012049. [Google Scholar] [CrossRef]
- Seshadrinathan, K.; Soundararajan, R.; Bovik, A.C. Study of Subjective and Objective Quality Assessment of Video. IEEE Trans. Image Process. 2010, 19, 1427–1441. [Google Scholar] [CrossRef]
- Im, S.-K.; Chan, K.-H. Dynamic estimator selection for double-bit-range estimation in VVC CABAC entropy coding. IET Image Process. 2024, 18, 722–730. [Google Scholar] [CrossRef]
- Chan, K.-H.; Im, S.-K. Using four hypothesis probability estimators for CABAC in versatile video coding. ACM Trans. Multimed. Comput. Commun. Appl. 2023, 19, 1–17. [Google Scholar] [CrossRef]
- Cofano, G.; De Cicco, L.; Zinner, T.; Nguyen-Ngoc, A.; Tran-Gia, P.; Mascolo, S. Design and experimental evaluation of network-assisted strategies for HTTP adaptive streaming. In Proceedings of the 7th International Conference on Multimedia Systems, Klagenfurt, Austria, 10–13 May 2016; pp. 1–12. [Google Scholar]
- Han, L.; Yu, L. A Variance Reduction Framework for Stable Feature Selection. Stat. Anal. Data Min. ASA Data Sci. J. 2012, 5, 428–445. [Google Scholar] [CrossRef]
- Strobl, C.; Boulesteix, A.L.; Kneib, T.; Augustin, T.; Zeileis, A. Conditional Variable Importance for Random Forests. BMC Bioinform. 2008, 9, 307. [Google Scholar] [CrossRef] [PubMed]
- Altmann, A.; Toloşi, L.; Sander, O.; Lengauer, T. Permutation Importance: A Corrected Feature Importance Measure. Bioinformatics 2010, 26, 1340–1347. [Google Scholar] [CrossRef] [PubMed]
- Menze, B.H.; Kelm, B.M.; Masuch, R.; Himmelreich, U.; Bachert, P.; Petrich, W.; Hamprecht, F.A. A Comparison of Random Forest and Its Gini Importance with Standard Chemometric Methods for the Feature Selection and Classification of Spectral Data. BMC Bioinform. 2009, 10, 213. [Google Scholar] [CrossRef]
- Martinez-Caro, J.-M.; Cano, M.-D. On the Identification and Prediction of Stalling Events to Improve QoE in Video Streaming. Electronics 2021, 10, 753. [Google Scholar] [CrossRef]
Content Title | Content Length (s) | Number of Ads | Length of Ads (s) | Position of Ads |
---|---|---|---|---|
Expo 2020 Dubai | 280 | 1 | 18 | Post-roll |
Squid game2 | 113 | 1 | 30 | Pre-roll |
Every death game SG | 375 | 1 | 18 | Mid-roll |
5 metaverse | 461 | 3 | 75 | Pre-roll, mid-roll, post-roll |
Created Light from Trash | 297 | 2 | 45 | Pre-roll |
How this guy found a stolen car! | 171 | 6 | 288 | Pre-roll, mid-roll, post-roll |
First underwater farm | 233 | 6 | 198 | Pre-roll, mid-roll, post-roll |
Most beautiful building in the world | 166 | 6 | 292 | Mid-roll |
This is made of...pee?! | 78 | 4 | 418 | Pre-roll |
The most unexplored place in the world | 256 | 5 | 391 | Post-roll |
Jeda Rodja 1 | 387 | 8 | 279 | Pre-roll |
Jeda Rodja 2 | 320 | 8 | 440 | Pre-roll, mid-roll, post-roll |
Jeda Rodja 3 | 415 | 6 | 272 | Pre-roll, mid-roll, post-roll |
Jeda Rodja 4 | 371 | 6 | 311 | Post-roll |
Jeda Rodja 5 | 376 | 6 | 311 | Mid-roll |
Reference | Influence Factors | Considered Features |
---|---|---|
Amour et al. [22] | Resolution, bandwidth, delay | Face emotion |
Bhattacharya et al. [23] | Delay, packet loss, bandwidth | Acoustic feature |
Porcu et al. [15] | Delay, stalling | Face and gaze tracking information |
Porcu et al. [24] | Blurring | Face and gaze tracking information |
Antons et al. [25] | Noise signal | EEG |
Kroupi et al. [26] | High quantization attribute | EEG |
Arndt et al. [27] | Low bitrate encoding | EEG and EOG |
Arndt et al. [28] | Low bitrate encoding | EEG and gaze movement information |
Engelke et al. [29] | Packet loss | Gaze movement information |
Rai et al. [30] | High quantization attribute | Gaze movement information |
Rai et al. [31] | Packet loss | Gaze movement information |
Bailenson et al. [32] | Provoked delight and anxiety | Face emotion and 15 physiological features |
Our Proposed Work | Video advertisement | Face emotion, video metadata and advertisement information |
Grading Value | Emotion |
---|---|
5 | Happy |
4 | Surprised |
3 | Neutral |
2 | Sad, fear |
1 | Disgust, anger |
Resolution | Bitrate | ITU-T P.1203 Results | Video Content Length | Star Review |
---|---|---|---|---|
1080 | 8000 | 5 | 301 | 1 |
1080 | 8000 | 5 | 301 | 2 |
1080 | 8000 | 5 | 301 | 1 |
1080 | 8000 | 5 | 301 | 4 |
1080 | 8000 | 5 | 301 | 4 |
720 | 5000 | 5 | 301 | 5 |
720 | 5000 | 5 | 303 | 1 |
Grade | Estimated Quality | Estimated Emotion |
---|---|---|
5 | Excellent | Happy |
4 | Good | Surprise |
3 | Fair | Neutral |
2 | Poor | Sad |
1 | Bad | Disgust, anger, fear |
The Most Annoying Advertisement Type (122 Participants)

Case | Participants | Percentage (%)
---|---|---
Many repeated advertisements at one point in time in mid-roll | 22 | 18.03%
A single 5-min-long advertisement in mid-roll | 22 | 18.03%
One repeated ad every minute within 5 min of video content | 21 | 17.21%
The same advertisement repeated in pre-, mid-, and post-roll | 18 | 14.75%
No skippable advertisements (unskippable ads) | 39 | 31.97%
Total | 122 | 100%
Amount | Type |
---|---|
661 | Total participants from around the world |
114 | Countries and cities represented |
125 | Completed questionnaires |
30 | Questionnaires completed with video recordings |
The Maximum Acceptable Advertisement Length

Time | Participants (%) |
---|---|
<10 s | 41.67% |
10–30 s | 37.5% |
Title | Res. | Bitrate | ITU | FER | Content Length | Ad Count | Ad Length | Ad Location | Repeated Ad | 5-min-Long Ad | Ad Each Min | Pre/Mid/Post Same Ad | No-Skip Ad | Star |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Stolen_car | 720 | 5000 | 5 | 3 | 459 | 6 | 288 | 4 | 0 | 0 | 1 | 0 | 1 | 1 |
Underwater_farm | 720 | 5000 | 5 | 3 | 431 | 6 | 198 | 4 | 1 | 0 | 0 | 1 | 1 | 2 |
beautiful_building | 720 | 5000 | 5 | 3 | 608 | 6 | 442 | 2 | 0 | 0 | 0 | 0 | 1 | 1 |
Made_of_pee | 720 | 5000 | 5 | 3 | 496 | 4 | 418 | 1 | 0 | 0 | 0 | 0 | 1 | 4 |
Unexplored_place | 720 | 5000 | 5 | 3 | 433 | 4 | 177 | 3 | 0 | 0 | 0 | 0 | 1 | 4 |
Amount | Attributes |
---|---|
16 | user.id, video.id, resolution, bitrate, fer, content.length, ad.loc, ad.count, long.5min.ad, ad.each.min, pre.mid.post.same.id, repeated.ad, no.skip, stars |
15 | user.id, video.id, itu, resolution, bitrate, fer, content.length, ad.loc, ad.count, long.5min.ad, ad.each.min, pre.mid.post.same.id, repeated.ad, no.skip, stars |
8 Selected Attributes Using Symmetrical Uncert Attribute Eval, Ranker, 10-fold cross-validation | video.id, fer, content.length, ad.loc, long.5min.ad, ad.each.min, pre.mid.post.same.id, repeated.ad, stars |
9 Selected Attributes Using Relief F Attribute Eval, Ranker, 10-fold cross-validation | video.id, content.length, ad.loc, ad.count, long.ad, long.5min.ad, ad.each.min, resolution, bitrate, stars |
13 Selected Attributes Using Correlation Attribute Eval Using Ranker, 10-fold cross-validation | user.id, video.id, itu, resolution, bitrate, fer, content.length, ad.loc, ad.count, long.5min.ad, ad.each.min, pre.mid.post.same.id, repeated.ad, stars |
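The attribute selection above was carried out in Weka (SymmetricalUncertAttributeEval, ReliefFAttributeEval, and CorrelationAttributeEval with the Ranker search and 10-fold cross-validation). As a rough Python analogue, and not the study's exact procedure, the sketch below ranks the attributes of a hypothetical, numerically coded dataset by mutual information with the star rating; the file name `qoe_sessions.csv` is a placeholder.

```python
# Rough analogue of information-based attribute ranking (not the Weka workflow
# used in the study): rank attributes by mutual information with "stars".
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

df = pd.read_csv("qoe_sessions.csv")      # hypothetical prepared dataset
X = df.drop(columns=["stars"])            # attributes listed in the table above
y = df["stars"]                           # 1-5 star label

scores = mutual_info_classif(X, y, discrete_features="auto", random_state=0)
ranking = pd.Series(scores, index=X.columns).sort_values(ascending=False)
print(ranking.head(9))                    # keep, e.g., the top-9 ranked attributes
```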
ML Method | Naïve Bayes Updateable | Multi-Layer Perceptron CS | Meta Random Subspace | ||||||
---|---|---|---|---|---|---|---|---|---|
Test Types | 94:6 | 60-Fold | Train Set | 94:6 | 60-Fold | Train Set | 94:6 | 60-Fold | Train Set |
CCI | 69.23% | 46.05% | 59.53% | 61.54% | 51.63% | 100.00% | 61.54% | 52.56% | 60% |
ICI | 30.77% | 53.95% | 40.47% | 38.46% | 48.37% | 0.00% | 38.46% | 47.44% | 39.53% |
RMSE | 0.3476 | 0.3914 | 0.3363 | 0.3873 | 0.4068 | 0.0121 | 0.3526 | 0.3585 | 32.64% |
Total Instances | 13 | 215 | 215 | 13 | 215 | 215 | 13 | 215 | 215 |
Precision | N/A | 0.493 | 0.635 | N/A | 0.503 | 1 | N/A | 0.612 | N/A |
Recall | 0.692 | 0.46 | 0.595 | 0.615 | 0.516 | 1 | 0.615 | 0.526 | 0.605 |
ML Method | Random Forest | CHIRP | Multiclass Classifier | ||||||
Test Types | 94:6 | 60-fold | Train Set | 94:6 | 60-fold | Train Set | 94:6 | 60-fold | Train Set |
CCI | 69.23% | 57.67% | 100.00% | 76.92% | 46.98% | 95% | 76.92% | 47.91% | 77.67% |
ICI | 30.77% | 42.33% | 0.00% | 23.08% | 53.02% | 5% | 23.08% | 52.09% | 22.33% |
RMSE | 0.3351 | 0.3495 | 0.1306 | 0.3038 | 0.4605 | 0.1431 | 0.3044 | 0.3881 | 0.2362 |
Total Instances | 13 | 215 | 215 | 13 | 215 | 215 | 13 | 215 | 215 |
Precision | N/A | 0.568 | 1 | N/A | 0.449 | 0.95 | 0.791 | 0.502 | 0.778 |
Recall | 0.962 | 0.577 | 1 | 0.769 | 0.47 | 0.949 | 0.769 | 0.479 | 0.777 |
ML Method | Meta Decorate | SMO | Furia | ||||||
Test Types | 94:6 | 60-fold | Train Set | 94:6 | 60-fold | Train Set | 94:6 | 60-fold | Train Set |
CCI | 69.23% | 42.79% | 95.35% | 69.23% | 53.02% | 72.09% | 46.15% | 56.28% | 57.67% |
ICI | 30.77% | 57.21% | 4.65% | 30.77% | 46.98% | 27.91% | 53.85% | 43.72% | 42.33% |
RMSE | 0.3158 | 0.3707 | 0.1948 | 0.3658 | 0.3642 | 0.3373 | 0.4549 | 0.3672 | 0.3417 |
Total Instances | 13 | 215 | 215 | 13 | 215 | 215 | 13 | 215 | 215 |
Precision | N/A | 0.435 | 0.955 | N/A | 0.519 | 0.75 | N/A | N/A | N/A |
Recall | 0.692 | 0.428 | 0.953 | 0.692 | 0.53 | 0.721 | 0.462 | 0.563 | 0.577 |
8 Selected Attribute Using Symmetrical Uncert Attribute Eval, Ranker, 10-Fold Cross-Validation | ||||||||
---|---|---|---|---|---|---|---|---|
Random Forest | Furia | Jrip | Meta Decorate | |||||
Test Types | 60-Fold | 94:6 | 60-Fold | 94:6 | 60-Fold | 94:6 | 60-Fold | 94:6 |
CCI | 53.85% | 33.02% | 45.58% | 53.85% | 45.58% | 53.85% | 48.37% | 53.85% |
ICI | 46.15% | 66.98% | 54.42% | 46.15% | 54.42% | 46.15% | 51.63% | 46.15% |
RMSE | 0.3698 | 0.3865 | 0.4197 | 0.4136 | 0.37 | 0.3774 | 0.3666 | 0.3578 |
Total Instances | 13 | 215 | 215 | 13 | 215 | 13 | 215 | 13 |
Precision | N/A | 0.298 | N/A | N/A | 0.543 | N/A | 0.491 | N/A |
Recall | 0.538 | 0.33 | 0.456 | 0.538 | 0.456 | 0.538 | 0.484 | 0.538 |
SMO | Tree SPAARC | Tree Optimized Forest | Local KNN | |||||
Test Types | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 |
CCI | 48.84% | 53.85% | 42.33% | 46.15% | 34.88% | 53.85% | 16.74% | 15.38% |
ICI | 51.16% | 46.15% | 57.67% | 53.85% | 65.12% | 46.15% | 83.26% | 84.62% |
RMSE | 0.3705 | 0.3823 | 0.3749 | 0.3845 | 0.4157 | 0.3724 | 0.396 | 0.4038 |
Total Instances | 215 | 13 | 215 | 13 | 215 | 13 | 215 | 13 |
Precision | 0.406 | N/A | N/A | N/A | 0.328 | N/A | 0.302 | N/A |
Recall | 0.488 | 0.538 | 0.423 | 0.462 | 0.349 | 0.538 | 0.167 | 0.154 |
Multi-Layer Perceptron | Naïve Bayes | Chirp | Multi Class Classifier | |||||
Test Types | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 |
CCI | 38.60% | 53.85% | 43.26% | 53.85% | 31.63% | 38.46% | 44.65% | 53.85% |
ICI | 61.40% | 46.15% | 56.74% | 46.15% | 68.37% | 61.54% | 55.35% | 46.15% |
RMSE | 0.3867 | 0.3637 | 0.3872 | 0.373 | 0.523 | 0.4961 | 0.3762 | 0.378 |
Total Instances | 215 | 13 | 215 | 13 | 215 | 13 | 215 | 13 |
Precision | 0.375 | N/A | 0.416 | N/A | 0.32 | N/A | 0.428 | N/A |
Recall | 0.386 | 0.538 | 0.433 | 0.538 | 0.316 | 0.358 | 0.447 | 0.538 |
9 Selected Attributes Using Relief F Attribute Eval, Ranker, 10-fold cross-validation | ||||||||
Random Forest | Furia | Jrip | Meta Decorate | |||||
Test Types | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 |
CCI | 37.21% | 46.15% | 53.85% | 47.44% | 47.44% | 53.85% | 46.98% | 61.54% |
ICI | 62.79% | 53.85% | 46.15% | 52.56% | 52.56% | 46.15% | 53.02% | 38.46% |
RMSE | 0.3909 | 0.3953 | 0.3778 | 0.3993 | 0.3666 | 0.3759 | 0.3741 | 0.3603 |
Total Instances | 215 | 13 | 13 | 215 | 21 | 13 | 215 | 13 |
Precision | 0.375 | N/A | N/A | N/A | N/A | N/A | 0.461 | N/A |
Recall | 0.338 | 0.462 | 0.538 | 0.474 | 0.474 | 0.538 | 0.47 | 0.615 |
SMO | Tree SPAARC | Tree Optimized Forest | Local KNN | |||||
Test Types | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 |
CCI | 48.84% | 53.85% | 41.40% | 46.15% | 35.81% | 38.46% | 46.05% | 7.69% |
ICI | 51.16% | 46.15% | 58.60% | 53.85% | 64.19% | 61.54% | 53.95% | 92.31% |
RMSE | 0.3683 | 0.3783 | 0.3756 | 0.3845 | 0.4162 | 0.4009 | 0.373 | 0.4175 |
Total Instances | 215 | 13 | 215 | 13 | 215 | 13 | 215 | 13 |
Precision | 0.414 | N/A | N/A | N/A | 0.335 | N/A | 0.379 | N/A |
Recall | 0.488 | 0.538 | 0.414 | 0.462 | 0.338 | 0.385 | 0.46 | 0.077 |
Multi-Layer Perceptron | Naïve Bayes | Chirp | Multi Class Classifier | |||||
Test Types | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 |
CCI | 42.33% | 61.54% | 39.07% | 61.54% | 34.42% | 30.77% | 46.05% | 61.54% |
ICI | 57.67% | 38.46% | 60.93% | 38.46% | 65.58% | 69.23% | 53.95% | 38.46% |
RMSE | 0.3807 | 0.3661 | 0.4029 | 0.3605 | 0.5122 | 0.5262 | 0.373 | 0.3706 |
Total Instances | 215 | 13 | 215 | 13 | 215 | 13 | 215 | 13 |
Precision | 0.379 | N/A | 0.398 | N/A | 0.393 | N/A | 0.379 | N/A |
Recall | 0.423 | 0.615 | 0.391 | 0.615 | 0.344 | 0.308 | 0.46 | 0.615 |
13 Selected Attributes Using Correlation Attribute Eval Using Ranker, 10-fold cross-validation | ||||||||
Random Forest | Furia | Jrip | Meta Decorate | |||||
Test Types | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 |
CCI | 30.70% | 46.15% | 48.37% | 53.85% | 46.98% | 53.85% | 44.19% | 53.85% |
ICI | 69.30% | 53.85% | 51.63% | 46.15% | 53.02% | 46.15% | 55.81% | 46.15% |
RMSE | 0.4179 | 0.4119 | 0.4163 | 0.4033 | 0.3671 | 0.3759 | 0.3715 | 0.3598 |
Total Instances | 215 | 13 | 215 | 13 | 215 | 13 | 215 | 13 |
Precision | 0.297 | N/A | N/A | N/A | N/A | N/A | 0.408 | N/A |
Recall | 0.307 | 0.462 | 0.484 | 0.538 | 0.47 | 0.538 | 0.442 | 0.538 |
SMO | Tree SPAARC | Tree Optimized Forest | Local KNN | |||||
Test Types | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 |
CCI | 47.44% | 53.85% | 44.65% | 46.15% | 31.63% | 38.46% | 34.88% | 30.77% |
ICI | 52.56% | 46.15% | 55.35% | 53.85% | 68.37% | 61.54% | 65.12% | 69.23% |
RMSE | 0.3689 | 0.3783 | 0.3708 | 0.3845 | 0.4293 | 0.4234 | 0.4335 | 0.4356 |
Total Instances | 215 | 13 | 215 | 13 | 215 | 13 | 215 | 13 |
Precision | 0.405 | N/A | N/A | N/A | 0.331 | 0.346 | 0.405 | 0.354 |
Recall | 0.474 | 0.538 | 0.447 | 0.462 | 0.316 | 0.385 | 0.349 | 0.308 |
Multi-Layer Perceptron | Naïve Bayes Simple | Chirp | Multi Class Classifier | |||||
Test Types | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 | 60-fold | 94:6 |
CCI | 33.95% | 53.85% | 38.14% | 46.15% | 36.74% | 38.46% | 44.19% | 46.15% |
ICI | 66.05% | 46.15% | 61.86% | 53.85% | 63.26% | 61.54% | 55.81% | 53.85% |
RMSE | 0.4224 | 0.3887 | 0.4093 | 0.3698 | 0.503 | 0.4961 | 0.3783 | 0.376 |
Total Instances | 215 | 13 | 215 | 13 | 215 | 13 | 215 | 13 |
Precision | 0.322 | N/A | 0.394 | N/A | 0.339 | N/A | 0.407 | N/A |
Recall | 0.34 | 0.538 | 0.381 | 0.462 | 0.367 | 0.385 | 0.442 | 0.462 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).