Systematic Review

A Systematic Literature Review of Modalities, Trends, and Limitations in Emotion Recognition, Affective Computing, and Sentiment Analysis

by Rosa A. García-Hernández 1, Huizilopoztli Luna-García 1,*, José M. Celaya-Padilla 1, Alejandra García-Hernández 1, Luis C. Reveles-Gómez 1, Luis Alberto Flores-Chaires 1, J. Ruben Delgado-Contreras 1, David Rondon 2 and Klinge O. Villalba-Condori 3,*
1 Laboratorio de Tecnologías Interactivas y Experiencia de Usuario, Universidad Autónoma de Zacatecas, Jardín Juárez 147, Centro, Zacatecas 98000, Mexico
2 Departamento Estudios Generales, Universidad Continental, Arequipa 04001, Peru
3 Vicerrectorado de Investigación, Universidad Católica de Santa María, Arequipa 04001, Peru
* Authors to whom correspondence should be addressed.
Appl. Sci. 2024, 14(16), 7165; https://doi.org/10.3390/app14167165
Submission received: 11 July 2024 / Revised: 6 August 2024 / Accepted: 13 August 2024 / Published: 15 August 2024
(This article belongs to the Special Issue Application of Affective Computing)

Abstract

This systematic literature review delves into the extensive landscape of emotion recognition, sentiment analysis, and affective computing, analyzing 609 articles. Exploring the intricate relationships among these research domains, and leveraging data from four well-established sources—IEEE, Science Direct, Springer, and MDPI—this systematic review classifies studies into four modalities based on the types of data analyzed. These modalities are unimodal, multi-physical, multi-physiological, and multi-physical–physiological. After the classification, key insights about applications, learning models, and data sources are extracted and analyzed. This review highlights the exponential growth in studies utilizing EEG signals for emotion recognition, and the potential of multimodal approaches combining physical and physiological signals to enhance the accuracy and practicality of emotion recognition systems. This comprehensive overview of research advances, emerging trends, and limitations from 2018 to 2023 underscores the importance of continued exploration and interdisciplinary collaboration in these rapidly evolving fields.

1. Introduction

Emotion recognition, affective computing, and sentiment analysis are related research themes with distinct focuses, methodologies, and objectives, while sharing a common interest in understanding and utilizing human emotions and affective states. Although these fields have distinct focuses, recent studies using multimodal data have highlighted important emerging interrelations. Understanding the specific types of data utilized is more insightful than merely categorizing studies under one of these research themes, as it helps to comprehensively understand the evolving dynamics occurring recently between these fields. Over time, advancements in technology and machine learning have led to more accurate and versatile models in each field. These areas are continuously evolving as researchers explore new applications and improved techniques to enhance human-computer interaction and decision-making processes. For instance, EEG emotion recognition has been applied to analyze the impact of music on emotional changes in psychological healthcare, highlighting its potential in therapeutic settings [1]. In the financial sector, innovative research has utilized speech emotion recognition and text sentiment analysis to predict financial distress, demonstrating the applicability of these technologies in economic and risk assessment contexts [2]. Additionally, automated systems for job interview performance prediction leverage multimodal emotion analysis to enhance recruitment processes, showcasing the integration of affective computing in human resources [3].
Below, the general purpose of each of these three topics is described, along with some of the studies that have focused on each of them and their interrelations.
Emotion recognition primarily deals with the identification and understanding of human emotions expressed through various modalities, such as facial expressions, speech, text, and physiological signals, like electroencephalography (EEG), electrocardiography (ECG), electrodermal activity (EDA), heart rate (HR), photoplethysmography (PPG), blood volume pulse (BVP), electromyography (EMG), galvanic skin response (GSR), electrooculography (EOG), and temperature and respiration signals.
The main goal in emotion recognition is to accurately detect and classify specific emotions (e.g., happiness, sadness, anger) in individuals or groups. This is often used for applications like human-computer interaction, e-learning, surveillance, and healthcare [1,4,5]. The following three studies serve as examples of articles containing the term “emotion recognition” in their titles: “Spatio-Temporal Encoder-Decoder Fully Convolutional Network for Video-Based Dimensional Emotion Recognition” [6], “Applying Self-Supervised Representation Learning for Emotion Recognition Using Physiological Signals” [7], and “A Multitask Learning Model For Multimodal Sarcasm, Sentiment And Emotion Recognition In Conversations” [8]. Despite all including the concept of emotion recognition in their titles, their approaches are markedly distinct. This underscores the importance of analyzing and classifying studies in this systematic review based on the type of data used in their analysis, rather than solely on the keywords of the research field to which they belong. When analyzing the types of data used, these studies can be categorized as follows: the first one falls into the “unimodal” category, given its focus only on facial emotion analysis; the second could be classified as “multi-physiological”, as it integrates various physiological signals for emotion recognition; and the third can be categorized as “multi-physical”, as it examines visual, acoustic, and textual physical signals derived from dialogue videos. In a similar fashion, the studies examined in this review were categorized based on the type of data they investigate.
Affective computing is a broader field that encompasses emotion recognition but also considers the broader spectrum of human affective states, which include emotions, moods, and attitudes. It takes into account not only human expressions but also the interpretation of those expressions by machines. Its objective is to enable computers to recognize and respond to human emotional cues to enhance user experiences. This includes recognizing user emotions, adapting system behavior accordingly, and even generating emotional responses [9,10]. Similar to the previously mentioned examples, the following studies feature the term “affective computing” in their titles: “Deep learning for affective computing: Text-based emotion recognition in decision support” [11], “Asian Affective and Emotional State (A2ES) Dataset of ECG and PPG for Affective Computing Research” [12], and “Utilizing Deep Learning Towards Multi-Modal Bio-Sensing and Vision-Based Affective Computing” [13]. However, when examining the types of data employed, these studies can be categorized as follows: the first one falls within the “unimodal” category, as it primarily focuses only on textual data. The second can be classified as “multi-physiological”, given its integration of ECG and PPG signals for affective computing research. Lastly, the third can be categorized as “multi-physical-physiological” since it examines both facial signals, which are physical data, and various physiological signals, for affective computing research.
Sentiment analysis, also known as opinion mining, is specifically focused on determining the sentiment or opinion expressed in text or speech. It is often applied to social media, product reviews, and customer feedback. Its primary goal is to classify text as positive, negative, or neutral, and sometimes to identify more specific sentiments, like joy, anger, or sadness. The focus is on understanding public opinion and feedback. Based on the types of data, the following two studies can be categorized as “multi-physical”: “An Ensemble-Learning-Based Technique for Bimodal Sentiment Analysis” [14] and “Tree-Based Mix-Order Polynomial Fusion Network for Multimodal Sentiment Analysis” [15], since the first one examines audio and text, while the second one analyzes audio, text, and visual physical data. On the other hand, the study titled “An Efficient Deep Learning for Thai Sentiment Analysis” [16] falls into the “unimodal” category as it solely employs textual data for its analysis.
As technology in data processing and machine learning has advanced, the research approaches of emotion recognition, affective computing, and sentiment analysis have become increasingly interconnected. The emergence of techniques such as deep learning has revolutionized the way these subjects are addressed, enabling more precise and sophisticated analysis of emotional expressions in multimodal data, including images, audio, text, and physiological signals [13,17,18,19]. Furthermore, these disciplines have naturally begun to merge, as real-world applications demand integrated approaches; examples include healthcare [4,20], human–machine interaction such as the analysis of drivers' emotions [21,22], e-learning [23], and social media data analysis [24,25].
In the literature, there are noteworthy recent reviews covering topics such as emotion recognition, affective computing, and sentiment analysis. Several of these reviews exhibit diverse approaches and objectives. Some studies concentrate on specific data types, such as those exclusively delving into physiological signal analysis [26,27,28] or physical cues, like facial emotion recognition [9,29]. Some studies focus on the sensors employed for data collection [30,31], while others examine the types of machine learning algorithms used for data analysis. For example, the study by Rouast P. et al., in 2021, which emphasizes deep learning investigations related to human affect recognition analysis, provides valuable insights and developments [32]. Similarly, the study by Ahmed N. et al., published in 2023, systematically reviews studies on multimodal emotion recognition employing machine learning algorithms [33]. However, there is currently no comprehensive review that employs these three specific concepts in its search string terms and strives to collectively address recent studies. Such an approach, involving their categorization based on the type of data they analyze, would facilitate a comprehensive understanding of how these recent studies, particularly those spanning 2018 to 2023, are interrelated and evolving in terms of trends in modalities, applications, data analysis models, and the origins of their analytic data. This analysis is crucial because it allows for the identification of best practices across different research themes. By comparing studies that may initially appear unrelated but utilize similar modalities in their data, we can uncover valuable insights and techniques that can be applied across various other fields, such as cognitive sciences. This multidisciplinary perspective not only enhances our understanding of the advancements in each field, but also promotes the adoption of successful practices from one area to another, enriching the overall research landscape. This study aims to provide a comprehensive overview of how these fields have advanced in terms of applications, learning models, and datasets, and how they have established essential connections to address the complex demands of everyday life in an increasingly digital and user-experience-oriented world.
The rest of this document is organized as follows. Section 2 describes the methodology employed, including the research questions and the search process to select relevant studies. Section 3 presents the results. Section 4 provides a comprehensive discussion, delving into the major findings in detail. Lastly, Section 5 is dedicated to the conclusions drawn from the analysis.

2. Methodology

This study was conducted following the essential guidelines for performing a systematic literature review (SLR) proposed by Kitchenham B. [32]. The key procedural steps involve well-structured planning, execution, and detailed reporting, as shown in Figure 1. Phase one is dedicated to planning, involving the definition of the research questions, objectives, and scope of the review, as well as the development of a clear search strategy, including search strings and criteria for article inclusion. In phase two, the conducting phase, reviewers systematically search across multiple databases and select relevant articles based on predefined criteria; data extraction and quality assessment are also integral parts of this phase. Finally, in phase three, the reporting stage, the SLR is presented with transparency, providing a clear and reproducible account of the search process, data extraction, and findings.
Prior to undertaking the SLR, an extensive analysis of the existing literature, including previous research, surveys, reviews, and other SLRs, was conducted. This analysis revealed a compelling need to examine existing studies that included any of the three defined search terms in their titles. Although some studies mention only one of these terms in their titles, the analysis of their data reveals that they often implied more than one or even all three terms in their approach. Therefore, it was considered essential to conduct an SLR that could perform an interdisciplinary analysis of all these recent studies. This will provide a clear understanding of how these concepts are intrinsically interconnected and how they have evolved in terms of applications, data analysis models used, and information sources employed. Furthermore, in light of the rapid advancement of technology and knowledge in recent years, these approaches are taking different directions, which is crucial to understand as they are generating new constraints and areas of opportunity that will need to be addressed for future research. Additionally, the PRISMA flow diagram, located in Appendix A, was employed to outline the steps taken by researchers during the paper selection process. The main phases outlined in the PRISMA flow diagram include identification, screening, and inclusion.
The general methodology of the review focuses on a comprehensive examination of academic literature related to “affective computing”, “sentiment analysis”, and “emotion recognition” published between 2018 and 2023. The primary objective is to address the research questions (RQs) outlined below in Section 2.1. The search process for selecting the studies is detailed in Figure 2. These studies were then categorized according to the type of information they analyze (unimodal, multi-physical, multi-physiological, or multi-physical–physiological). Furthermore, information pertaining to the data analysis models utilized, the applications addressed in the studies, the information sources employed, and common research limitations were subsequently extracted.

2.1. Research Questions

The following questions are addressed in this study:
RQ1. 
What are the prevailing trends in research on “affective computing”, “sentiment analysis”, and “emotion recognition” from 2018 to 2023, in terms of unimodal, multi-physical, multi-physiological, and multi-physical–physiological approaches?
RQ2. 
What are the most commonly used data analysis models in studies of “affective computing”, “sentiment analysis”, and “emotion recognition”, and how have they evolved in terms of applications and approaches over the examined period?
RQ3. 
What are the most frequently employed sources of information and databases in research related to “affective computing”, “sentiment analysis”, and “emotion recognition”, and what limitations do studies encounter when using these sources in their analyses?

2.2. Search Process

Figure 2 describes the search process followed in order to select the pertinent studies that would help to solve the RQs. Four databases were carefully selected to conduct this review, as they include specialized journals that cover the central topics of this research. In addition to their thematic focus, these databases provide access to a wide range of articles. This combination of thematic relevance and availability of information was vital to gain a comprehensive and up-to-date understanding of the most important research in the area of interest. These databases were IEEE, Springer, Science Direct, and MDPI.

2.2.1. Search Terms

The search process involved the careful choice of keywords, which included “emotion recognition”, “sentiment analysis”, and “affective computing”. These specific terms were further adapted as search queries to align with the research questions and to accommodate the unique requirements of each search engine, considering their respective resource databases.

2.2.2. Inclusion and Exclusion Criteria

For the inclusion and exclusion criteria of this review, initial searches were conducted in the databases using the designated keywords, with a focus on retrieving results where any of these keywords appeared in the publication title. Subsequently, a filter was applied to include only publications from the year 2018 to June 2023.
Additional filters were then implemented to exclusively consider peer-reviewed scientific articles, excluding conference papers and other non-scientific publications. The results were further sorted by each database's relevance ranking, which considers factors such as keyword frequency, journal impact factor, and citation count. The top 200 most relevant articles from each database were then selected to ensure a comprehensive yet manageable analysis. The outcomes of these searches are presented in Table 1.

2.2.3. Quality Assessment

Due to the nature of this study and in accordance with the diverse research questions it encompasses, the quality assessment of the included studies was conducted as follows. Given that the focus of the review analysis revolves around categorizing studies into four distinct categories, a scoring system was employed. Studies that explicitly defined the type of data they utilized were assigned 2 points, while those studies whose primary focus was not on emotions but still delineated the type of data received 1 point. Studies that did not include this specification or were deemed irrelevant to the research topic, such as those focusing on emotional recognition from a psychological perspective that explores individuals’ cognitive abilities in recognizing emotions in others, were allocated 0 points. After this assessment, only the studies with 1 or 2 points were kept for the next phase. The quantities of articles kept from each database are listed in Table 2.
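To make the rubric concrete, the following minimal Python sketch applies the same scoring logic to a handful of hypothetical study records; the record fields and example titles are illustrative assumptions and do not come from the review's actual screening sheet.

# Minimal sketch of the quality-assessment rubric described above.
# The fields "defines_data_type" and "emotion_focused" are hypothetical;
# they stand in for judgements reviewers made while screening each study.

def quality_score(study: dict) -> int:
    """Return 2, 1, or 0 points according to the rubric."""
    if not study["defines_data_type"]:
        return 0  # data type not specified, or study irrelevant to the topic
    return 2 if study["emotion_focused"] else 1

candidates = [
    {"title": "EEG-based emotion recognition", "defines_data_type": True, "emotion_focused": True},
    {"title": "Sensor survey (data type given)", "defines_data_type": True, "emotion_focused": False},
    {"title": "Psychology of recognizing emotions in others", "defines_data_type": False, "emotion_focused": False},
]

# Keep only studies scoring 1 or 2 points, as in the review.
selected = [s for s in candidates if quality_score(s) >= 1]
print([s["title"] for s in selected])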

2.2.4. Data Extraction

Once the selected studies were listed, for the specific analysis in accordance with the research questions previously outlined, an abstract review was undertaken to categorize each study. In cases where the information within the abstract did not specify the type of data analyzed, an examination of the methodology section was conducted. Table 3 provides an overview of the categorization of the studies, which serves as a preliminary step for in-depth analysis of the research questions.
Moreover, this review encompassed the extraction of information related to the type of data used and benchmark databases that have gained prominence in the field and have been consistently utilized in the analyzed articles. Additionally, trends in data analysis models, including machine learning, deep learning, neural networks, transformers, and related approaches, were identified and tabulated.
The review also involved gathering data on the various application domains where the research on emotion recognition, sentiment analysis, and affective computing has been applied. This holistic approach allows for a more thorough understanding of the research landscape. These findings will be presented in detail within the results section.

3. Results

3.1. Overview

The results are organized according to the categories defined for classifying the studies, which were established based on the type of data analyzed. Figure 3 provides an overview of the research modalities from 2018 to 2023, showing that studies with unimodal data analysis have predominantly been conducted.
There is a noticeable and drastic increase in the years 2022 and 2023. Similarly, albeit to a much lesser extent, it can be observed that in these last two years, approaches involving the analysis of multi-physical data have also increased. Subsequent sections will delve into the specific types of data studied within these modalities, as well as the machine learning models for data analysis, databases, and application trends.

3.2. Unimodal Data Approaches

As is discernible from Figure 4, studies centered on the analysis of unimodal data were further categorized into those employing physical signals or data and those employing physiological signals or data. Amongst studies utilizing physical signals, it is evident that, over the period covered by this study, emotion analysis using facial analysis, speech analysis, and textual analysis has been dominant. The latter, in particular, experienced an increase of 200% in the year 2023 compared to the previous year. This remarkable uptick can be attributed to the abundance of emotional textual data available on various platforms such as social networks, product reviews, and blog comments.
From the articles analyzed in this systematic review, 155 examined text-based information. Among these, 59 indicated that they had extracted data from social networks, blog comments, or review-type comments. Notably, 41% of these articles specifically sourced data from Twitter.
Concerning studies employing unimodal physiological signals, EEG signals are a prominent choice, showcasing a remarkable capacity for emotion generalization, with a substantial acceleration in the quantity of studies in the years 2022 and 2023. Additionally, various other physiological signals are starting to be explored for their potential in emotion generalization, including ECG, EDA, HR, and PPG signals, although they are still very limited in number.

3.2.1. Unimodal Physical Approaches

Unimodal facial emotion recognition has been extensively studied in recent years. In this study, the aim is to understand the new trends in terms of databases, models, and applications. Table 4 lists the titles of articles with a specific unimodal focus on facial emotion recognition (from the studies included in this systematic review) that describe the databases they used for their study. This list reveals which databases are most frequently utilized.
Some of the databases included in the articles listed in Table 4 are considered benchmarks in the field of facial emotion recognition. Among the most frequently repeated databases in the studies included in this review, the following can be identified: JAFFE (1997), CK+ [49], Oulu-CASIA [50], FER2013 (2013), the OMG One-Minute Gradual-Emotion Dataset [51], and AffectNet [35]. Below is a brief description of each of these, highlighting their advantages and limitations.
JAFFE (Japanese Female Facial Expression 1997): JAFFE is a database that focuses on facial expressions of Japanese women. It contains images of seven basic emotions, including happiness, sadness, anger, surprise, fear, disgust, and neutrality. The images were captured in a controlled environment where subjects were provided with visual stimuli or specific instructions to express basic emotions. This database is useful for studying cultural differences in emotional expression. JAFFE was created at the University of Kyoto in Japan in the 1990s and was published in 1997.
CK+ (Cohn-Kanade Database) [49]: CK+ is a database of facial expression images featuring actors representing basic emotions such as happiness, sadness, anger, surprise, fear, contempt, and neutrality. The images were captured in controlled conditions and are accurately labeled. CK+ is renowned for its quality and precision in emotional labeling. Emotions were not artificially induced in a controlled environment; instead, they were captured as actors genuinely expressed them. It was developed at the University of Pittsburgh and released in multiple phases over time, with the complete version made public in 2010.
Oulu-CASIA [50]: Developed by the University of Oulu in Finland in 2011. It focuses on the authenticity of facial expressions and contains images of Finnish subjects recorded under controlled conditions while expressing both genuine and fake facial emotions. It is valuable for research involving the detection of genuine and fake facial expressions, which is relevant in lie detection applications. It includes basic emotions such as happiness, surprise, sadness, anger, contempt, fear, and neutrality.
FER2013 (Facial Expression Recognition 2013): Created in 2013, it was used in a Kaggle competition called the “Facial Expression Recognition Challenge”. This database includes images of facial expressions in real-world situations. While it contains a vast amount of data, the quality of labels may vary as some images come from online sources.
OMG-One-Minute Gradual-Emotion Dataset [51]: This relatively new dataset, published in 2018, focuses on representing emotions with gradually evolving facial expressions in one-minute video clips. It is designed to capture the transition of emotions and subtle changes in facial expressions over time. The labels represent the gradual evolution of emotions throughout the videos collected from a variety of YouTube channels. This dataset is used for training and evaluating emotion recognition algorithms to understand how emotions manifest over time and how algorithms can recognize and track these emotional transitions in video sequences.
AffectNet [35]: This is a database of approximately 1 million images used for facial emotion recognition. It was presented at the 2017 IEEE International Conference on Automatic Face and Gesture Recognition. It contains a wide variety of subjects, with diversity in races, ages, and genders. Each image is labeled with the emotion represented by the person's face. The labels follow the primary-emotion model, covering happiness, sadness, anger, surprise, fear, disgust, and neutrality. The images were collected from online sources, meaning that specific emotions were not induced in a controlled environment; instead, emotions were retrospectively labeled by human annotators who evaluated the images. It presents challenges, such as the variability of human emotions and inherent limitations in emotion labeling in images. These aspects can impact the accuracy of facial recognition models.
In the facial emotion recognition studies reviewed, the predominant machine learning models include convolutional neural networks (CNNs), transformer-based and deep learning models, and long short-term memory (LSTM) networks. Additionally, transfer learning, ensemble methods, and hybrid models that combine various approaches also play a significant role.
To illustrate how these common AI models are specifically tailored and applied to address unique challenges in facial emotion recognition, the following examples highlight their distinct and context-specific applications within this research domain:
In their research, Zia Ullah et al. [52] present an advanced technique for facial emotion detection, organized into three phases: super-resolution, emotion recognition, and classification. This approach employs an improved Deep CNN, the Viola–Jones algorithm, and various feature extraction methods to enhance image quality and emotion detection accuracy. The classification phase utilizes RNN and Bi-GRU networks with a score level fusion mechanism, outperforming traditional methods.
On the other hand, Zheng W. et al. [53] introduce a new method to improve facial expression recognition when the training and testing images come from different sources, which is often difficult because of differences in the images. The method uses transfer learning by combining labeled images from one source with unlabeled images from another source to learn a common feature space and predict labels for the unlabeled images using a transductive transfer regularized least-squares regression (TTRLSR) model. After this, an SVM classifier is trained on these images to classify expressions in the target images. The method also uses color features from key points on the face to improve accuracy. Tests on two different databases show that this new approach, which integrates TTRLSR and SVM, performs better than current methods.
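Neither of the pipelines above is reproduced here, but the transfer-learning pattern that underlies much of this work can be sketched briefly. The following PyTorch snippet is a generic, hedged illustration, not a detail taken from the reviewed articles: it fine-tunes an ImageNet-pretrained ResNet-18 for seven-class facial expression classification, with the frozen layers, batch size, and data source treated as assumptions.

# Hedged sketch: fine-tuning a pretrained CNN for 7-class facial expression
# recognition (happiness, sadness, anger, surprise, fear, disgust, neutral).
# Not the pipeline of any specific study reviewed here.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 7

# Start from an ImageNet-pretrained backbone (transfer learning).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new classification head

# Freeze early layers so only the head and the last block adapt.
for name, param in model.named_parameters():
    if not (name.startswith("fc") or name.startswith("layer4")):
        param.requires_grad = False

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)

# One dummy training step on random tensors to show the loop shape; in
# practice `images` would come from a face dataset such as FER2013 or CK+.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, NUM_CLASSES, (8,))

model.train()
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(f"dummy training loss: {loss.item():.3f}")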
Table 5 lists the trends in applications and focus areas in facial emotion recognition studies included in this review.

3.2.2. Unimodal Speech Data Approaches

In the case of unimodal speech emotion recognition articles, databases exhibit significant diversity due to the nature of the data, comprising databases in numerous languages. For this reason, no specific results are included concerning databases; however, these are the identified trends in learning models and applications.
  • Several articles mention the use of transfer learning for speech emotion recognition. This technique involves training models on one dataset and applying them to another. This can improve the efficiency of emotion recognition across different datasets.
  • Some articles discuss multitask learning models, which are designed to simultaneously learn multiple related tasks. In the context of speech emotion recognition, this approach may help capture commonalities and differences across different datasets or emotions.
  • Data augmentation techniques, which involve generating additional training data from existing data, are mentioned in multiple articles; these can improve model performance and generalization.
  • Attention mechanisms are a common trend for improving emotion recognition. Attention models allow the model to focus on specific features or segments of the input data that are most relevant for recognizing emotions, such as in multi-level attention-based approaches.
  • Many articles discuss the use of deep learning models, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and some variants like the “Two-Stage Fuzzy Fusion Based-Convolution Neural Network”, “Deep Convolutional LSTM”, and “Attention-Oriented Parallel CNN Encoders” (a minimal feature-extraction and CNN sketch is provided after this list).
  • While deep learning is prevalent, some articles explore novel feature engineering methods, such as modulation spectral features and wavelet packet information gain entropy, to enhance emotion recognition.
  • From the list of articles on unimodal emotion recognition through speech, 7.14% address the challenge of recognizing emotions across different datasets or corpora. This is an important trend for making emotion recognition models more versatile.
  • A few articles focus on making emotion recognition models more interpretable and explainable, which is crucial for real-world applications and understanding how the model makes its predictions.
  • Ensemble methods, which combine multiple models to make predictions, are mentioned in several articles as a way to improve the performance of emotion recognition systems.
  • Some articles discuss emotion recognition in specific contexts, such as call/contact centers, school violence detection, depression detection, analysis of podcast recordings, noisy environment analysis, in-the-wild sentiment analysis, and speech emotion segmentation of vowel-like and non-vowel-like regions. This indicates a trend toward applying emotion recognition in diverse applications.
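As referenced in the deep-learning bullet above, the sketch below gives a rough, hedged illustration of the kind of pipeline these trends describe: a log-mel spectrogram computed with librosa is scored by a small CNN in PyTorch. The synthetic waveform, label set, and architecture are assumptions for illustration, not components of any reviewed study.

# Hedged sketch: a common unimodal speech-emotion pipeline — compute a
# log-mel spectrogram and classify it with a small CNN. The synthetic sine
# wave stands in for a real utterance; the label set is illustrative only.
import numpy as np
import librosa
import torch
import torch.nn as nn

SR = 16000
EMOTIONS = ["neutral", "happy", "sad", "angry"]  # hypothetical label set

# Synthetic 2-second "utterance" so the example is self-contained.
t = np.linspace(0, 2.0, 2 * SR, endpoint=False)
waveform = 0.5 * np.sin(2 * np.pi * 220 * t).astype(np.float32)

# Log-mel spectrogram features, a typical front end in the reviewed studies.
mel = librosa.feature.melspectrogram(y=waveform, sr=SR, n_mels=64)
log_mel = librosa.power_to_db(mel)                                 # (64, frames)
x = torch.tensor(log_mel, dtype=torch.float32).unsqueeze(0).unsqueeze(0)  # (1, 1, 64, frames)

# Minimal CNN classifier over the spectrogram "image".
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, len(EMOTIONS)),
)

with torch.no_grad():
    probs = model(x).softmax(dim=-1).squeeze()  # untrained scores, for shape only
print({e: round(float(p), 3) for e, p in zip(EMOTIONS, probs)})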

3.2.3. Unimodal Text Data Approaches

Text analytics for emotion recognition has experienced significant growth in recent years, driven by a wide range of readily available and easily accessible data sources such as social media comments on Twitter, Facebook, Instagram, and blogs. Other researchers have used stock market data and tourism-related reviews; these sources offer valuable insight into the perceptions and emotions of investors in the financial markets, as well as the experiences and opinions of travelers in the tourism industry. These data have enabled the training and validation of text analytics models.
Among the most prominent trends are deep learning models, such as LSTM, CNNs, bidirectional long short-term memory (Bi-LSTM), and gated recurrent units (GRUs), a variant of RNNs designed to address the vanishing gradient problem often found in standard RNNs. GRUs do this by introducing “gates” that regulate the flow of information within the unit, allowing the network to retain long-term information and discard irrelevant information. These approaches also include their hybrid variants.
In addition, transformer-based models, such as bidirectional encoder representations from transformers (BERT), have emerged as one of the latest trends. These models were developed by Google and are known for their capability to understand the context of a word in a sentence from two directions (bidirectional), which is a significant improvement over previous models that read text sequentially (either left-to-right or right-to-left). Moreover, the BERT architecture uses attention mechanisms to weight the influence of different parts of the input on the output, which allows the model to learn complex contexts and relationships between words and sentences, significantly improving performance on a wide range of natural language processing tasks, such as text comprehension, machine translation, and question answering.
A good example of the application of these models is provided by the study reported by Tan K. et al. [54], which presents a new model for sentiment analysis that combines the BERT-based transformer model RoBERTa and the recurrent neural network model GRU. RoBERTa helps understand the text using attention mechanisms, while the GRU handles long-term information and avoids common issues like vanishing gradients. To address imbalanced datasets, the model uses data augmentation to increase the representation of minority classes. The RoBERTa-GRU model was tested on the IMDb, Sentiment140, and Twitter US Airline Sentiment datasets, achieving high accuracy rates and demonstrating its strong performance in sentiment analysis.
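As a hedged sketch of the general pattern described above (not the implementation published in [54]), the following PyTorch snippet feeds RoBERTa token embeddings into a GRU whose final hidden state drives a sentiment head; the pretrained checkpoint name, three-class output, and example sentences are assumptions chosen for illustration.

# Hedged sketch of the RoBERTa-plus-GRU idea: contextual token embeddings from
# a pretrained transformer feed a GRU, whose final hidden state drives a
# sentiment classifier. An illustration of the pattern, not the model of [54].
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class RobertaGRUClassifier(nn.Module):
    def __init__(self, num_classes: int = 3, hidden: int = 128):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("roberta-base")
        self.gru = nn.GRU(self.encoder.config.hidden_size, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, **enc):
        token_states = self.encoder(**enc).last_hidden_state  # (B, T, 768)
        _, final_hidden = self.gru(token_states)               # (1, B, hidden)
        return self.head(final_hidden.squeeze(0))              # (B, num_classes)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = RobertaGRUClassifier()

texts = ["The flight was delayed again, terrible service.",
         "Absolutely loved this movie!"]
enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**enc)
print(logits.softmax(dim=-1))  # untrained scores over negative/neutral/positive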

3.2.4. Unimodal Physiological Data Approaches

Based on the analysis of the articles that were classified within the unimodal category with physiological signals, and as shown before in Figure 4 (right portion), the following can be summarized:
  • EEG signals continue to dominate the field of study, as they provide direct information about brain activity and fluctuations in brain waves, which can reflect emotional states with high accuracy. EEG is particularly suitable for detecting subtle changes in emotional state and identifying specific patterns in the brain related to emotions. However, it requires the placement of electrodes on the scalp, which can be uncomfortable and limit mobility, and may be more susceptible to artifacts and noise than other physiological signals. During 2022 and 2023, studies involving emotion recognition with these signals grew considerably, and some of the identified trends are as follows: using EEG signals to enhance human–computer interaction, especially in applications where an intuitive understanding of human emotions is required; emotion recognition for patients with disorders of consciousness, during movie viewing, in virtual environments, and in driving scenarios; using EEG to aid in the detection and monitoring of mental health issues such as depression; and exploring personalization methods tailored to individual user characteristics, suggesting a direction towards more individualized and user-specific systems. In terms of analysis models, most works focus on recognizing emotions from EEG signals using a variety of deep learning approaches and signal processing techniques. CNNs, RNNs, and combinations of them are widely used (a minimal EEG classification sketch is provided after this list), but there are other trends like the following:
    • Attention and self-attention mechanisms: These suggest that researchers are paying attention to the relevance of different parts of EEG signals for emotion recognition.
    • Generative adversarial networks (GANs): Used for generating synthetic EEG data in order to improve the robustness and generalization of the models.
    • Semi-supervised learning and domain transfer: Allow emotion recognition with limited datasets or datasets that are applicable to different domains, suggesting a concern for scalability and generalization of models.
    • Interpretability and explainability: There is a growing interest in models that are interpretable and explainable, suggesting a concern for understanding how models make decisions and facilitating user trust in them.
    • Utilization of transformers and capsule networks: Newer neural network architectures such as transformers and capsule networks are being explored for emotion recognition, indicating an interest in enhancing the modeling and representation capabilities of EEG signals.
  • Although studies with a unimodal physiological approach using signals other than EEG, such as ECG, EDA, HR, and PPG, are still scarce, these signals can provide information about the cardiovascular system and the body’s autonomic response to emotions. Their limitations are that they may not be as specific or sensitive in detecting subtle or changing emotions. Noise and artifacts, such as motion, can affect the quality of these signals in practical situations, and they can be influenced by non-emotional factors, such as physical exercise and fatigue. Various studies explore the utilization of ECG and PPG signals for emotion recognition and stress classification. Techniques such as CNNs, LSTMs, attention mechanisms, self-supervised learning, and data augmentation are employed to analyze these signals and extract meaningful features for emotion recognition tasks. Bayesian deep learning frameworks are utilized for probabilistic modeling and uncertainty estimation in emotion prediction from HB data. These approaches aim to enhance human–computer interaction, improve mental health monitoring, and develop personalized systems for emotion recognition based on individual user characteristics.
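As referenced in the first bullet above, the following is a minimal, hedged sketch of the kind of EEG emotion classifier many reviewed studies use as a baseline: a compact 1-D CNN over windowed, multichannel EEG segments in PyTorch. Channel count, window length, and the four-class labeling are illustrative assumptions.

# Hedged sketch: a compact 1-D CNN over windowed multichannel EEG, a common
# baseline in the reviewed studies. Channel count, window length, and the
# valence-arousal label scheme are illustrative assumptions.
import torch
import torch.nn as nn

N_CHANNELS = 32      # e.g., a 32-electrode montage
WINDOW = 128 * 4     # 4 s of EEG sampled at 128 Hz
N_CLASSES = 4        # e.g., quadrants of the valence-arousal plane

model = nn.Sequential(
    nn.Conv1d(N_CHANNELS, 64, kernel_size=7, padding=3), nn.BatchNorm1d(64), nn.ReLU(),
    nn.MaxPool1d(4),
    nn.Conv1d(64, 128, kernel_size=5, padding=2), nn.BatchNorm1d(128), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(128, N_CLASSES),
)

# Random tensors stand in for preprocessed (band-pass filtered, normalized)
# EEG segments; in practice these would come from a benchmark such as DEAP.
segments = torch.randn(16, N_CHANNELS, WINDOW)   # (batch, channels, samples)
labels = torch.randint(0, N_CLASSES, (16,))

logits = model(segments)
loss = nn.CrossEntropyLoss()(logits, labels)
print(logits.shape, float(loss))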

3.3. Multi-Physical Data Approaches

In Figure 5, the x-axis displays the various modalities found in articles classified as multi-physical because they use more than one type of physical data in their analysis. On the y-axis, the quantity of articles for each modality and their distribution across the years are shown. Among the most prominent are studies that use the fusion of audio–text–visual, audiovisual, and audio–text modalities. Additionally, there is a notable trend with a focus on facial analysis fused with micro-expressions, which are subtle and fleeting emotional signals. This demonstrates a growing interest in understanding and detecting emotions in situations where facial expressions are brief and difficult to capture. From additional analysis of articles classified as multi-physical, several important points can be highlighted:
  • Most studies employ CNNs and RNNs, while others utilize variations of general neural networks, such as spiking neural networks (SNNs) and tree-based neural networks. SNNs represent and transmit information through discrete bursts of neuronal activity, known as “spikes” or “pulses”, unlike conventional neural networks, which process information in continuous values. Additionally, several studies leverage advanced analysis models such as stacked ensemble models and multimodal fusion models, which focus on integrating diverse sources of information to enhance decision-making (a minimal late-fusion sketch is provided after this list). Transfer learning models and hybrid attention networks aim to capitalize on knowledge from related tasks or domains to improve performance in a target task. Attention-based neural networks prioritize capturing relevant information and patterns within the data. Semi-supervised and contrastive learning models offer alternative learning paradigms by incorporating both labeled and unlabeled data.
  • The studies address diverse applications, including sarcasm, sentiment, and emotion recognition in conversations, financial distress prediction, performance evaluation in job interviews, emotion-based location recommendation systems, user experience (UX) analysis, and emotion detection in video games and in educational settings. This suggests that emotion recognition through multi-physical data analysis has a wide spectrum of applications in everyday life.
  • Various audio and video signal processing techniques are employed, including pitch analysis, facial feature detection, cross-attention, and representational learning.
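As referenced in the first bullet above, the following hedged PyTorch sketch illustrates the feature-level fusion idea: separate encoders embed audio, text, and visual features, and the concatenated embeddings feed a shared classifier. Feature dimensions, encoder choices, and the six-class output are assumptions, not details drawn from any specific study in Table 6.

# Hedged sketch of feature-level fusion across physical modalities:
# per-modality encoders produce fixed-size embeddings that are concatenated
# and classified jointly. Architectures and dimensions are illustrative.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, audio_dim=40, text_dim=300, visual_dim=512,
                 embed=64, num_classes=6):
        super().__init__()
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, embed), nn.ReLU())
        self.text_enc = nn.Sequential(nn.Linear(text_dim, embed), nn.ReLU())
        self.visual_enc = nn.Sequential(nn.Linear(visual_dim, embed), nn.ReLU())
        self.classifier = nn.Sequential(
            nn.Linear(3 * embed, embed), nn.ReLU(), nn.Linear(embed, num_classes)
        )

    def forward(self, audio, text, visual):
        fused = torch.cat(
            [self.audio_enc(audio), self.text_enc(text), self.visual_enc(visual)],
            dim=-1,
        )
        return self.classifier(fused)

# Random vectors stand in for precomputed utterance-level features
# (e.g., MFCC statistics, sentence embeddings, pooled face-CNN features).
model = LateFusionClassifier()
logits = model(torch.randn(4, 40), torch.randn(4, 300), torch.randn(4, 512))
print(logits.shape)  # torch.Size([4, 6])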
Moreover, Table 6 lists the databases identified from the studies classified as multi-physical, offering a brief description of each, including their respective advantages and limitations. Some of the references next to the database name indicate one of the studies in which this database was used, or the actual study in which the database was made public.

3.4. Multi-Physiological Data Approaches

In Figure 6, the x-axis displays the signal modalities present in articles classified as multi-physiological, indicating the use of more than one type of physiological signal data in their analysis. On the y-axis, the chart represents the number of articles for each different modality. This category comprises 21 articles, out of which 9 utilize their own data for analysis. The remaining studies make use of the 6 databases described in Table 7, known as benchmark databases for affective computing using physiological signals. Some of the references next to the database name indicate one of the studies in which this database was used, or the actual study in which the database was made public.
It is important to note that some of these databases also include audiovisual data, which is why they are also mentioned in some of the multi-physical–physiological modality analysis.
As a result of analyzing the articles classified as multi-physiological, which use fusions of physiological signals, several important points can be highlighted:
  • The fusion of physiological signals, such as EEG, ECG, PPG, GSR, EMG, BVP, EOG, respiration, temperature, and movement signals, is a predominant trend in these studies. The combination of multiple physiological signals allows for a richer representation of emotions.
  • Most studies apply deep learning models, such as CNNs, RNNs, and autoencoder neural networks (AE), for the processing and analysis of these signals. Supervised and unsupervised learning approaches are also used.
  • These studies focus on a variety of applications, such as emotion recognition in healthcare environments, brain–computer interfaces for music, emotion detection in interactive virtual environments, stress assessment in mobility environments for visually impaired people, among others. This indicates that emotion recognition based on physiological signals has applications in healthcare, technology, and beyond.
  • Some studies focus on personalized emotion recognition, suggesting tailoring of models for each individual. This may be relevant for personalized health and wellness applications. Others focus on interactive applications and virtual environments useful for entertainment and virtual therapy.
  • It is important to mention that the studies within this classification are quite limited in comparison to the previously described modalities. Although it appears that they are using similar physiological signals, the databases differ in terms of their approaches and generation methods. Therefore, there is an opportunity to establish a protocol for generating these databases, allowing for meaningful comparisons among studies.

3.5. Multi-Physical–Physiological Data Approaches

Figure 7 illustrates the various combinations of physical and physiological signals identified in the 18 studies classified as multi-physical–physiological. As depicted in the graph, the prevalence of this type of study remains limited. Most of these studies were conducted from 2021 to 2023, which are the last 3 years from the 6-year timeframe this study encompasses, except for two studies that were published in 2018; therefore, it can be said that this is a relatively new trend. Based on the analysis of these studies, these trends can be highlighted:
  • Studies tend to combine multiple types of signals, such as EEG, facial expressions, voice signals, GSR, and other physiological data. Combining signals aims to take advantage of the complementarity of different modalities to improve accuracy in emotion detection.
  • Machine learning models, in particular CNNs, are widely used in signal fusion for emotion recognition. CNN models can effectively process data from multiple modalities.
  • Applications are also being explored in the health and wellness domain, such as emotion detection for emotional health analysis of people in smart environments.
  • The use of standardized and widely accepted databases is important for comparing results between different studies; however, these are still limited.
  • The trend towards non-intrusive sensors and wireless technology enables data collection in more natural and less intrusive environments, which facilitates the practical application of these systems in everyday environments.

4. Discussion

The efforts throughout this research have been focused on providing robust and well-founded answers to the key questions that drove this study. RQ1 asks what are the prevailing trends in research on “affective computing”, “sentiment analysis”, and “emotion recognition” from 2018 to 2023, in terms of unimodal, multi-physical, multi-physiological, and multi-physical–physiological approaches. Addressing this RQ, from the detailed examination of studies classified under the different modalities, several prominent trends have emerged.
Among the unimodal approaches analyzed in this systematic review, facial expression recognition, text analysis, and speech recognition have garnered significant attention based on the volume of published studies. In terms of physical signals, these modalities have emerged as prominent areas of research. Additionally, the analysis of EEG signals has notably stood out among physiological signal analyses. Considering these findings, the prevailing trends in applications within these domains are as follows:
  • Facial expression analysis approaches are currently being applied across various domains, including naturalistic settings (“in the wild”), on-road driver monitoring, virtual reality environments, smart homes, IoT and edge devices, and assistive robots. There is also a focus on mental health assessment, including autism, depression, and schizophrenia, and distinguishing between genuine and unfelt facial expressions of emotion. Efforts are being made to improve performance in processing faces acquired at a distance despite the challenges posed by low-quality images. Furthermore, there is an emerging interest in utilizing facial expression analysis in human–computer interaction (HCI), learning environments, and multicultural contexts.
  • The recognition of emotions through speech and text has experienced tremendous growth, largely due to the abundance of information facilitated by advancements in technology and social media. This has enabled individuals to express their opinions and sentiments through various media, including podcast recordings, live videos, and readily available data sources such as social media platforms like Twitter, Facebook, Instagram, and blogs. Additionally, researchers have utilized unconventional sources like stock market data and tourism-related reviews. The variety and richness of these data sources indicate a wide range of segments where such emotion recognition analyses can be applied effectively.
  • EEG signals continue to be a prominent modality for emotion recognition due to their highly accurate insights into emotional states. Between 2022 and 2023, studies in this field experienced exponential growth. The identified trends include utilizing EEG for enhancing human–computer interaction and recognizing emotions in various contexts, such as patients with consciousness disorders, movie viewing, virtual environments, and driving scenarios. EEG is also being used for detecting and monitoring mental health issues, and there is a growing focus on personalization, leading towards more individualized and user-specific emotion recognition systems. Other physiological signals, such as ECG, EDA, and HR, are also gaining attention, albeit at a slower pace.
  • In the realm of multi-physical, multi-physiological, and multi-physical–physiological approaches, it is the multi-physical approach that appears to be laying the groundwork, as evidenced by the abundance of studies in this area. The two approaches incorporating fusions with physiological signals are still relatively scarce but seem to be paving the way for future researchers to contribute to their growth. Multimodal approaches, which integrate both physical and physiological signals, are finding diverse applications in emotion recognition. These range from healthcare systems, individual and group mood research, personality recognition, pain intensity recognition, anxiety detection, work stress detection, and stress classification and security monitoring in public spaces, to vehicle security monitoring, movie audience emotion recognition, applications for autism spectrum disorder detection, music interfacing, and virtual environments.
RQ2 asks what are the most commonly used data analysis models in studies of “affective computing”, “sentiment analysis”, and “emotion recognition”, and how have they evolved in terms of applications and approaches over the examined period. Addressing this RQ, we find that the articles in this review cover a wide range of topics within these fields. They utilize diverse data sources, including physiological signals, voice, text and facial expressions, to name a few. The diversity of these articles reflects the interdisciplinary nature of their applications in healthcare, social network analysis, human–computer interaction, and other fields. The extensive analysis conducted in this review highlights a variety of prominent data analysis models and approaches, as detailed in the following list:
  • Bidirectional encoder representations from transformers: Used in sentiment analysis and emotion recognition from text, BERT models can understand the context of words in sentences by pre-training on a large text and then fine-tuning for specific tasks like sentiment analysis.
  • CNNs: These are commonly applied in facial emotion recognition, emotion recognition from physiological signals, and even in speech emotion recognition by analyzing spectrograms.
  • RNNs and variants (LSTM, GRU): These models are suited for sequential data like speech and text. LSTMs and GRUs are particularly effective in speech emotion recognition and sentiment analysis of time-series data.
  • Graph convolutional networks (GCNs): Applied in emotion recognition from EEG signals and conversation-based emotion recognition, these can model relational data and capture the complex dependencies in graph-structured data, like brain connectivity patterns or conversational contexts.
  • Attention mechanisms and transformers: Enhancing the ability of models to focus on relevant parts of the data, attention mechanisms are integral to models like transformers for tasks that require understanding the context, such as sentiment analysis in long documents or emotion recognition in conversations.
  • Ensemble models: Combining predictions from multiple models to improve accuracy, ensemble methods are used in multimodal emotion recognition, where inputs from different modalities (e.g., audio, text, and video) are integrated to make more accurate predictions.
  • Autoencoders and generative adversarial networks (GANs): For tasks like data augmentation in emotion recognition from EEG or for generating synthetic data to improve model robustness, these unsupervised learning models can learn compact representations of data or generate new data samples, respectively.
  • Multimodal fusion models: In applications requiring the integration of multiple data types (e.g., speech, text, and video for emotion recognition), fusion models combine features from different modalities to capture more comprehensive information for prediction tasks.
  • Transfer learning: Utilizing pre-trained models on large datasets and fine-tuning them for specific affective computing tasks, transfer learning is particularly useful in scenarios with limited labeled data, such as sentiment analysis in niche domains.
  • Spatio-temporal models: For tasks that involve data with both spatial and temporal dimensions (like video-based emotion recognition or physiological signal analysis), models that capture spatio-temporal dynamics are employed, combining approaches like CNNs for spatial features and RNNs/LSTMs for temporal features (see the sketch after this list).
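As referenced in the spatio-temporal bullet above, the following hedged PyTorch sketch illustrates the CNN-plus-LSTM pattern for video-based emotion recognition: a small CNN encodes each frame and an LSTM aggregates the frame embeddings over time. Input size, clip length, and the seven-class output are illustrative assumptions.

# Hedged sketch of a spatio-temporal model for video-based emotion recognition:
# a small CNN extracts spatial features per frame, an LSTM models temporal
# dynamics across frames. Frame size, sequence length, and classes are assumed.
import torch
import torch.nn as nn

class CNNLSTMEmotion(nn.Module):
    def __init__(self, embed=128, hidden=64, num_classes=7):
        super().__init__()
        self.frame_cnn = nn.Sequential(                       # spatial features
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, embed),
        )
        self.lstm = nn.LSTM(embed, hidden, batch_first=True)  # temporal dynamics
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, clips):                 # clips: (B, T, 3, H, W)
        b, t = clips.shape[:2]
        frames = clips.flatten(0, 1)          # (B*T, 3, H, W)
        embeddings = self.frame_cnn(frames).view(b, t, -1)
        _, (h_n, _) = self.lstm(embeddings)
        return self.head(h_n.squeeze(0))      # (B, num_classes)

model = CNNLSTMEmotion()
clips = torch.randn(2, 16, 3, 112, 112)       # two 16-frame clips
print(model(clips).shape)                     # torch.Size([2, 7])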
These models and approaches are tailored to the specific characteristics of the data and the requirements of each task, reflecting the rich methodological diversity in the field of affective computing and emotion recognition and sentiment analysis.
Specific programming languages and frameworks are predominantly used in the implementation of these AI models to facilitate the use of large datasets and complex architectures. Python is the most widely used language due to its versatility and because it has a wide range of specialized libraries. Frameworks such as TensorFlow, PyTorch, and Keras are essential for the development of deep neural networks and other advanced models.
In the realm of RQ3, which asks what are the most frequently employed sources of information and databases in research related to “affective computing”, “sentiment analysis”, and “emotion recognition”, and what limitations studies encounter when using these sources, it has been observed that sources of information and databases, particularly in the text analytics domain, have witnessed substantial growth due to the availability of large volumes of data from sources like social media, product reviews, and blog comments. Deep learning models, transformer-based models like BERT, and the combination of deep learning with different techniques have become prominent trends in text analytics. Nevertheless, the availability of databases for physiological signals or the fusion of physical and physiological signals remains limited, and standard protocols for generating these databases have yet to be established. Consequently, comparing results across various studies remains challenging. Thus, it is evident that there is still a considerable journey ahead in the analysis of emotions using non-EEG physiological signals, which are presently somewhat confined to controlled laboratory applications. This limitation is driving a trend towards non-intrusive sensors and wireless technology that facilitate data collection in natural environments, providing valuable insights into real-world emotional states. As technology advances and interdisciplinary collaboration continues, it is expected that these limitations will be addressed, paving the way for more robust and comprehensive emotion analysis in a wide range of practical applications.
The analysis of these trends in “affective computing”, “sentiment analysis”, and “emotion recognition” research provides a comprehensive understanding of the evolving landscape in data sources, analysis models, and applications. Researchers must consider these trends and the advantages and limitations of databases when conducting studies in these domains.

5. Conclusions

In conclusion, there has been a notable increase in the volume of research studies focusing on unimodal approaches, specifically those related to facial expression analysis, text sentiment analysis, and dialogue emotion recognition. This surge in research activity can be attributed to significant advancements in information processing techniques, natural language understanding, and the growing accessibility of online data sources. In the realm of physiological signals, a noteworthy trend has been the rising interest in studies centered around EEG signals, which have been extensively explored and have demonstrated exceptional results in emotion recognition tasks. However, such studies often require medical-grade or intrusive devices. Although there have been advancements in creating more compact versions of these devices, they still cannot match the convenience and non-intrusiveness of, for example, a smartwatch, which is already a part of everyday life for many people. Therefore, the development of technologies that can utilize non-intrusive devices to achieve similar levels of effectiveness as EEG-based systems remains a significant challenge. Additionally, the modality of fusing multi-physiological and physical signals could be key to addressing this problem. This approach could potentially enhance the accuracy and practicality of emotion recognition systems, making them more suitable for everyday use. This underscores the importance of continuing to explore and develop diverse modalities within emotion recognition research. However, this development faces several inherent complexities: these signals vary significantly between individuals, making it difficult to standardize them for broad-spectrum applications. The processing and analysis of physiological data are complex and require the development of advanced algorithms to correctly interpret the signals. In addition, building a reliable and representative database requires a large sample of participants to capture interpersonal variability. Due to these complexities inherent in the use of physiological signals in affective computing, there is still a wide area of opportunity for the development of studies in this field. These challenges not only highlight the need for innovative approaches and more advanced technological solutions, but also underscore the importance of interdisciplinary research spanning areas such as psychology, computer science, and bioengineering. These solutions could lead to significant advances in how machines understand and respond to human emotional states, expanding the possibilities for their application in a variety of fields.

Author Contributions

Conceptualization, R.A.G.-H.; methodology, R.A.G.-H. and H.L.-G.; validation, R.A.G.-H., H.L.-G. and J.M.C.-P.; formal analysis, R.A.G.-H.; investigation, R.A.G.-H.; resources, A.G.-H., L.C.R.-G., L.A.F.-C., J.R.D.-C., D.R. and K.O.V.-C.; data curation, R.A.G.-H.; writing—original draft preparation, R.A.G.-H.; writing—review and editing, R.A.G.-H.; visualization, R.A.G.-H., H.L.-G. and J.M.C.-P.; supervision, H.L.-G., J.M.C.-P. and K.O.V.-C.; project administration, R.A.G.-H., H.L.-G. and J.M.C.-P.; funding acquisition, K.O.V.-C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors thank CONACYT for the support granted through its national scholarship program (CVU number 307551) to Rosa Adriana García Hernández.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Paper selection for literature review using PRISMA [67].

References

  1. Zhou, T.H.; Liang, W.; Liu, H.; Wang, L.; Ryu, K.H.; Nam, K.W. EEG Emotion Recognition Applied to the Effect Analysis of Music on Emotion Changes in Psychological Healthcare. Int. J. Environ. Res. Public Health 2022, 20, 378. [Google Scholar] [CrossRef] [PubMed]
  2. Hajek, P.; Munk, M. Speech Emotion Recognition and Text Sentiment Analysis for Financial Distress Prediction. Neural Comput. Appl. 2023, 35, 21463–21477. [Google Scholar] [CrossRef]
  3. Naim, I.; Tanveer, M.d.I.; Gildea, D.; Hoque, M.E. Automated Analysis and Prediction of Job Interview Performance. IEEE Trans. Affect. Comput. 2018, 9, 191–204. [Google Scholar] [CrossRef]
  4. Ayata, D.; Yaslan, Y.; Kamasak, M.E. Emotion Recognition from Multimodal Physiological Signals for Emotion Aware Healthcare Systems. J. Med. Biol. Eng. 2020, 40, 149–157. [Google Scholar] [CrossRef]
  5. Maithri, M.; Raghavendra, U.; Gudigar, A.; Samanth, J.; Barua, D.P.; Murugappan, M.; Chakole, Y.; Acharya, U.R. Automated Emotion Recognition: Current Trends and Future Perspectives. Comput. Methods Programs Biomed. 2022, 215, 106646. [Google Scholar] [CrossRef] [PubMed]
  6. Du, Z.; Wu, S.; Huang, D.; Li, W.; Wang, Y. Spatio-Temporal Encoder-Decoder Fully Convolutional Network for Video-Based Dimensional Emotion Recognition. IEEE Trans. Affect. Comput. 2021, 12, 565–578. [Google Scholar] [CrossRef]
  7. Montero Quispe, K.G.; Utyiama, D.M.S.; dos Santos, E.M.; Oliveira, H.A.B.F.; Souto, E.J.P. Applying Self-Supervised Representation Learning for Emotion Recognition Using Physiological Signals. Sensors 2022, 22, 9102. [Google Scholar] [CrossRef]
  8. Zhang, Y.; Wang, J.; Liu, Y.; Rong, L.; Zheng, Q.; Song, D.; Tiwari, P.; Qin, J. A Multitask Learning Model for Multimodal Sarcasm, Sentiment and Emotion Recognition in Conversations. Inf. Fusion 2023, 93, 282–301. [Google Scholar] [CrossRef]
  9. Leong, S.C.; Tang, Y.M.; Lai, C.H.; Lee, C.K.M. Facial Expression and Body Gesture Emotion Recognition: A Systematic Review on the Use of Visual Data in Affective Computing. Comput. Sci. Rev. 2023, 48, 100545. [Google Scholar] [CrossRef]
  10. Aranha, R.V.; Correa, C.G.; Nunes, F.L.S. Adapting Software with Affective Computing: A Systematic Review. IEEE Trans. Affect. Comput. 2021, 12, 883–899. [Google Scholar] [CrossRef]
  11. Kratzwald, B.; Ilić, S.; Kraus, M.; Feuerriegel, S.; Prendinger, H. Deep Learning for Affective Computing: Text-Based Emotion Recognition in Decision Support. Decis. Support. Syst. 2018, 115, 24–35. [Google Scholar] [CrossRef]
  12. Ab. Aziz, N.A.; K., T.; Ismail, S.N.M.S.; Hasnul, M.A.; Ab. Aziz, K.; Ibrahim, S.Z.; Abd. Aziz, A.; Raja, J.E. Asian Affective and Emotional State (A2ES) Dataset of ECG and PPG for Affective Computing Research. Algorithms 2023, 16, 130. [Google Scholar] [CrossRef]
  13. Jung, T.-P.; Sejnowski, T.J. Utilizing Deep Learning Towards Multi-Modal Bio-Sensing and Vision-Based Affective Computing. IEEE Trans. Affect. Comput. 2022, 13, 96–107. [Google Scholar] [CrossRef]
  14. Shah, S.; Ghomeshi, H.; Vakaj, E.; Cooper, E.; Mohammad, R. An Ensemble-Learning-Based Technique for Bimodal Sentiment Analysis. Big Data Cogn. Comput. 2023, 7, 85. [Google Scholar] [CrossRef]
  15. Tang, J.; Hou, M.; Jin, X.; Zhang, J.; Zhao, Q.; Kong, W. Tree-Based Mix-Order Polynomial Fusion Network for Multimodal Sentiment Analysis. Systems 2023, 11, 44. [Google Scholar] [CrossRef]
  16. Khamphakdee, N.; Seresangtakul, P. An Efficient Deep Learning for Thai Sentiment Analysis. Data 2023, 8, 90. [Google Scholar] [CrossRef]
  17. Jo, A.-H.; Kwak, K.-C. Speech Emotion Recognition Based on Two-Stream Deep Learning Model Using Korean Audio Information. Appl. Sci. 2023, 13, 2167. [Google Scholar] [CrossRef]
  18. Abdulrahman, A.; Baykara, M.; Alakus, T.B. A Novel Approach for Emotion Recognition Based on EEG Signal Using Deep Learning. Appl. Sci. 2022, 12, 10028. [Google Scholar] [CrossRef]
  19. Middya, A.I.; Nag, B.; Roy, S. Deep Learning Based Multimodal Emotion Recognition Using Model-Level Fusion of Audio–Visual Modalities. Knowl. Based Syst. 2022, 244, 108580. [Google Scholar] [CrossRef]
  20. Ali, M.; Mosa, A.H.; Al Machot, F.; Kyamakya, K. EEG-Based Emotion Recognition Approach for e-Healthcare Applications. In Proceedings of the 2016 Eighth International Conference on Ubiquitous and Future Networks (ICUFN), Vienna, Austria, 5–8 July 2016; pp. 946–950. [Google Scholar]
  21. Zepf, S.; Hernandez, J.; Schmitt, A.; Minker, W.; Picard, R.W. Driver Emotion Recognition for Intelligent Vehicles. ACM Comput. Surv. (CSUR) 2020, 53, 1–30. [Google Scholar] [CrossRef]
  22. Zaman, K.; Zhaoyun, S.; Shah, B.; Hussain, T.; Shah, S.M.; Ali, F.; Khan, U.S. A Novel Driver Emotion Recognition System Based on Deep Ensemble Classification. Complex. Intell. Syst. 2023, 9, 6927–6952. [Google Scholar] [CrossRef]
  23. Du, Y.; Crespo, R.G.; Martínez, O.S. Human Emotion Recognition for Enhanced Performance Evaluation in E-Learning. Prog. Artif. Intell. 2022, 12, 199–211. [Google Scholar] [CrossRef]
  24. Alaei, A.; Wang, Y.; Bui, V.; Stantic, B. Target-Oriented Data Annotation for Emotion and Sentiment Analysis in Tourism Related Social Media Data. Future Internet 2023, 15, 150. [Google Scholar] [CrossRef]
  25. Caratù, M.; Brescia, V.; Pigliautile, I.; Biancone, P. Assessing Energy Communities’ Awareness on Social Media with a Content and Sentiment Analysis. Sustainability 2023, 15, 6976. [Google Scholar] [CrossRef]
  26. Bota, P.J.; Wang, C.; Fred, A.L.N.; Placido Da Silva, H. A Review, Current Challenges, and Future Possibilities on Emotion Recognition Using Machine Learning and Physiological Signals. IEEE Access 2019, 7, 140990–141020. [Google Scholar] [CrossRef]
  27. Egger, M.; Ley, M.; Hanke, S. Emotion Recognition from Physiological Signal Analysis: A Review. Electron. Notes Theor. Comput. Sci. 2019, 343, 35–55. [Google Scholar] [CrossRef]
  28. Shu, L.; Xie, J.; Yang, M.; Li, Z.; Li, Z.; Liao, D.; Xu, X.; Yang, X. A Review of Emotion Recognition Using Physiological Signals. Sensors 2018, 18, 2074. [Google Scholar] [CrossRef]
  29. Canal, F.Z.; Müller, T.R.; Matias, J.C.; Scotton, G.G.; de Sa Junior, A.R.; Pozzebon, E.; Sobieranski, A.C. A Survey on Facial Emotion Recognition Techniques: A State-of-the-Art Literature Review. Inf. Sci. 2022, 582, 593–617. [Google Scholar] [CrossRef]
  30. Assabumrungrat, R.; Sangnark, S.; Charoenpattarawut, T.; Polpakdee, W.; Sudhawiyangkul, T.; Boonchieng, E.; Wilaiprasitporn, T. Ubiquitous Affective Computing: A Review. IEEE Sens. J. 2022, 22, 1867–1881. [Google Scholar] [CrossRef]
  31. Schmidt, P.; Reiss, A.; Dürichen, R.; Van Laerhoven, K. Wearable-Based Affect Recognition—A Review. Sensors 2019, 19, 4079. [Google Scholar] [CrossRef]
  32. Rouast, P.V.; Adam, M.T.P.; Chiong, R. Deep Learning for Human Affect Recognition: Insights and New Developments. IEEE Trans. Affect. Comput. 2021, 12, 524–543. [Google Scholar] [CrossRef]
  33. Ahmed, N.; Aghbari, Z.A.; Girija, S. A Systematic Survey on Multimodal Emotion Recognition Using Learning Algorithms. Intell. Syst. Appl. 2023, 17, 200171. [Google Scholar] [CrossRef]
  34. Kitchenham, B. Procedures for Performing Systematic Reviews; Keele University: Keele, UK, 2004; Volume 33, pp. 1–26. [Google Scholar]
  35. Mollahosseini, A.; Hasani, B.; Mahoor, M.H. AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild. IEEE Trans. Affect. Comput. 2019, 10, 18–31. [Google Scholar] [CrossRef]
  36. Al Jazaery, M.; Guo, G. Video-Based Depression Level Analysis by Encoding Deep Spatiotemporal Features. IEEE Trans. Affect. Comput. 2021, 12, 262–268. [Google Scholar] [CrossRef]
  37. Kollias, D.; Zafeiriou, S. Exploiting Multi-CNN Features in CNN-RNN Based Dimensional Emotion Recognition on the OMG in-the-Wild Dataset. IEEE Trans. Affect. Comput. 2021, 12, 595–606. [Google Scholar] [CrossRef]
  38. Li, S.; Deng, W. A Deeper Look at Facial Expression Dataset Bias. IEEE Trans. Affect. Comput. 2022, 13, 881–893. [Google Scholar] [CrossRef]
  39. Kulkarni, K.; Corneanu, C.A.; Ofodile, I.; Escalera, S.; Baro, X.; Hyniewska, S.; Allik, J.; Anbarjafari, G. Automatic Recognition of Facial Displays of Unfelt Emotions. IEEE Trans. Affect. Comput. 2021, 12, 377–390. [Google Scholar] [CrossRef]
  40. Punuri, S.B.; Kuanar, S.K.; Kolhar, M.; Mishra, T.K.; Alameen, A.; Mohapatra, H.; Mishra, S.R. Efficient Net-XGBoost: An Implementation for Facial Emotion Recognition Using Transfer Learning. Mathematics 2023, 11, 776. [Google Scholar] [CrossRef]
  41. Mukhiddinov, M.; Djuraev, O.; Akhmedov, F.; Mukhamadiyev, A.; Cho, J. Masked Face Emotion Recognition Based on Facial Landmarks and Deep Learning Approaches for Visually Impaired People. Sensors 2023, 23, 1080. [Google Scholar] [CrossRef]
  42. Babu, E.K.; Mistry, K.; Anwar, M.N.; Zhang, L. Facial Feature Extraction Using a Symmetric Inline Matrix-LBP Variant for Emotion Recognition. Sensors 2022, 22, 8635. [Google Scholar] [CrossRef]
  43. Mustafa Hilal, A.; Elkamchouchi, D.H.; Alotaibi, S.S.; Maray, M.; Othman, M.; Abdelmageed, A.A.; Zamani, A.S.; Eldesouki, M.I. Manta Ray Foraging Optimization with Transfer Learning Driven Facial Emotion Recognition. Sustainability 2022, 14, 14308. [Google Scholar] [CrossRef]
  44. Bisogni, C.; Cimmino, L.; De Marsico, M.; Hao, F.; Narducci, F. Emotion Recognition at a Distance: The Robustness of Machine Learning Based on Hand-Crafted Facial Features vs Deep Learning Models. Image Vis. Comput. 2023, 136, 104724. [Google Scholar] [CrossRef]
  45. Sun, Q.; Liang, L.; Dang, X.; Chen, Y. Deep Learning-Based Dimensional Emotion Recognition Combining the Attention Mechanism and Global Second-Order Feature Representations. Comput. Electr. Eng. 2022, 104, 108469. [Google Scholar] [CrossRef]
  46. Sudha, S.S.; Suganya, S.S. On-Road Driver Facial Expression Emotion Recognition with Parallel Multi-Verse Optimizer (PMVO) and Optical Flow Reconstruction for Partial Occlusion in Internet of Things (IoT). Meas. Sens. 2023, 26, 100711. [Google Scholar] [CrossRef]
  47. Barra, P.; De Maio, L.; Barra, S. Emotion Recognition by Web-Shaped Model. Multimed. Tools Appl. 2023, 82, 11321–11336. [Google Scholar] [CrossRef]
  48. Bhattacharya, A.; Choudhury, D.; Dey, D. Edge-Enhanced Bi-Dimensional Empirical Mode Decomposition-Based Emotion Recognition Using Fusion of Feature Set. Soft Comput. 2018, 22, 889–903. [Google Scholar] [CrossRef]
  49. Lucey, P.; Cohn, J.F.; Kanade, T.; Saragih, J.; Ambadar, Z.; Matthews, I. The Extended Cohn-Kanade Dataset (CK+): A Complete Dataset for Action Unit and Emotion-Specified Expression. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition-Workshops, San Francisco, CA, USA, 13–18 June 2010; pp. 94–101. [Google Scholar]
  50. Zhao, G.; Huang, X.; Taini, M.; Li, S.Z.; Pietikäinen, M. Facial Expression Recognition from Near-Infrared Videos. Image Vis. Comput. 2011, 29, 607–619. [Google Scholar] [CrossRef]
  51. Barros, P.; Churamani, N.; Lakomkin, E.; Siqueira, H.; Sutherland, A.; Wermter, S. The OMG-Emotion Behavior Dataset. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–7. [Google Scholar]
  52. Ullah, Z.; Qi, L.; Hasan, A.; Asim, M. Improved Deep CNN-Based Two Stream Super Resolution and Hybrid Deep Model-Based Facial Emotion Recognition. Eng. Appl. Artif. Intell. 2022, 116, 105486. [Google Scholar] [CrossRef]
  53. Zheng, W.; Zong, Y.; Zhou, X.; Xin, M. Cross-Domain Color Facial Expression Recognition Using Transductive Transfer Subspace Learning. IEEE Trans. Affect. Comput. 2018, 9, 21–37. [Google Scholar] [CrossRef]
  54. Tan, K.L.; Lee, C.P.; Lim, K.M. RoBERTa-GRU: A Hybrid Deep Learning Model for Enhanced Sentiment Analysis. Appl. Sci. 2023, 13, 3915. [Google Scholar] [CrossRef]
  55. Ren, M.; Huang, X.; Li, W.; Liu, J. Multi-Loop Graph Convolutional Network for Multimodal Conversational Emotion Recognition. J. Vis. Commun. Image Represent. 2023, 94, 103846. [Google Scholar] [CrossRef]
  56. Mai, S.; Hu, H.; Xu, J.; Xing, S. Multi-Fusion Residual Memory Network for Multimodal Human Sentiment Comprehension. IEEE Trans. Affect. Comput. 2022, 13, 320–334. [Google Scholar] [CrossRef]
  57. Yang, L.; Jiang, D.; Sahli, H. Integrating Deep and Shallow Models for Multi-Modal Depression Analysis—Hybrid Architectures. IEEE Trans. Affect. Comput. 2021, 12, 239–253. [Google Scholar] [CrossRef]
  58. Mocanu, B.; Tapu, R.; Zaharia, T. Multimodal Emotion Recognition Using Cross Modal Audio-Video Fusion with Attention and Deep Metric Learning. Image Vis. Comput. 2023, 133, 104676. [Google Scholar] [CrossRef]
  59. Noroozi, F.; Marjanovic, M.; Njegus, A.; Escalera, S.; Anbarjafari, G. Audio-Visual Emotion Recognition in Video Clips. IEEE Trans. Affect. Comput. 2019, 10, 60–75. [Google Scholar] [CrossRef]
  60. Davison, A.K.; Lansley, C.; Costen, N.; Tan, K.; Yap, M.H. SAMM: A Spontaneous Micro-Facial Movement Dataset. IEEE Trans. Affect. Comput. 2018, 9, 116–129. [Google Scholar] [CrossRef]
  61. Happy, S.L.; Routray, A. Fuzzy Histogram of Optical Flow Orientations for Micro-Expression Recognition. IEEE Trans. Affect. Comput. 2019, 10, 394–406. [Google Scholar] [CrossRef]
  62. Schmidt, P.; Reiss, A.; Duerichen, R.; Marberger, C.; Van Laerhoven, K. Introducing WESAD, a Multimodal Dataset for Wearable Stress and Affect Detection. In Proceedings of the 20th ACM International Conference on Multimodal Interaction, New York, NY, USA, 2 October 2018; ACM: New York, NY, USA, 2018; pp. 400–408. [Google Scholar]
  63. Miranda-Correa, J.A.; Abadi, M.K.; Sebe, N.; Patras, I. AMIGOS: A Dataset for Affect, Personality and Mood Research on Individuals and Groups. IEEE Trans. Affect. Comput. 2021, 12, 479–493. [Google Scholar] [CrossRef]
  64. Subramanian, R.; Wache, J.; Abadi, M.K.; Vieriu, R.L.; Winkler, S.; Sebe, N. ASCERTAIN: Emotion and Personality Recognition Using Commercial Sensors. IEEE Trans. Affect. Comput. 2018, 9, 147–160. [Google Scholar] [CrossRef]
  65. Koelstra, S.; Muhl, C.; Soleymani, M.; Lee, J.-S.; Yazdani, A.; Ebrahimi, T.; Pun, T.; Nijholt, A.; Patras, I. DEAP: A Database for Emotion Analysis; Using Physiological Signals. IEEE Trans. Affect. Comput. 2012, 3, 18–31. [Google Scholar] [CrossRef]
  66. Zhang, Y.; Cheng, C.; Wang, S.; Xia, T. Emotion Recognition Using Heterogeneous Convolutional Neural Networks Combined with Multimodal Factorized Bilinear Pooling. Biomed. Signal Process Control 2022, 77, 103877. [Google Scholar] [CrossRef]
  67. The PRISMA 2020 Statement: An Updated Guideline for Reporting Systematic Reviews. Available online: https://www.prisma-statement.org/prisma-2020-statement (accessed on 12 August 2024).
Figure 1. Systematic literature review process following the guidelines of Kitchenham B. [34].
Figure 2. Studies’ search process following the guidelines of Kitchenham B. [34].
Figure 3. Quantity of articles from each category published from 2018 to 2023.
Figure 4. Quantity of articles from unimodal categories.
Figure 5. Quantity of articles classified to the multi-physical category published from 2018 to 2023.
Figure 6. Quantity of articles classified as multi-physiological published from 2018 to 2023.
Figure 7. Quantity of articles classified as multi-physical–physiological published from 2018 to 2023.
Table 1. Inclusion and exclusion criteria.
Database | Resulted Studies with Key Terms | After Years Filter | After Article Type | Relevant Order
IEEE | 2112 | 1152 | 536 | 200
Springer | 4121 | 1808 | 1694 | 200
Science Direct | 1041 | 582 | 480 | 200
MDPI | 686 | 643 | 635 | 200
Table 2. Quantity of articles from each database that have successfully undergone quality assessment.
Database | Quantity
IEEE | 148
Springer | 112
Science Direct | 166
MDPI | 183
Table 3. Quantity of articles from each category along with their respective publication years.
Modality | 2018 | 2019 | 2020 | 2021 | 2022 | 2023 | Total
Multi-physical | 8 | 6 | – | 8 | 22 | 27 | 71
Multi-physical–physiological | 2 | – | – | 3 | 6 | 7 | 18
Multi-physiological | 2 | – | 6 | 3 | 6 | 4 | 21
Unimodal | 37 | 26 | 29 | 37 | 176 | 194 | 499
Total | 49 | 32 | 35 | 51 | 210 | 232 | 609
Table 4. Articles focused on unimodal facial emotion recognition and the databases they used.
Article Title | Databases Used | Ref.
AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild. | AffectNet | [35]
Video-Based Depression Level Analysis by Encoding Deep Spatiotemporal Features. | AVEC2013, AVEC2014 | [36]
Exploiting Multi-CNN Features in CNN-RNN Based Dimensional Emotion Recognition on the OMG in-the-Wild Dataset. | Aff-Wild, Aff-Wild2, OMG | [37]
A Deeper Look at Facial Expression Dataset Bias. | CK+, JAFFE, MMI, Oulu-CASIA, AffectNet, FER2013, RAF-DB 2.0, SFEW 2.0 | [38]
Automatic Recognition of Facial Displays of Unfelt Emotions. | CK+, OULU-CASIA, BP4D | [39]
Spatio-Temporal Encoder-Decoder Fully Convolutional Network for Video-Based Dimensional Emotion Recognition. | OMG, RECOLA, SEWA | [6]
Efficient Net-XGBoost: An Implementation for Facial Emotion Recognition Using Transfer Learning. | CK+, FER2013, JAFFE, KDEF | [40]
Masked Face Emotion Recognition Based on Facial Landmarks and Deep Learning Approaches for Visually Impaired People. | AffectNet | [41]
Facial Feature Extraction Using a Symmetric Inline Matrix-LBP Variant for Emotion Recognition. | JAFFE | [42]
Manta Ray Foraging Optimization with Transfer Learning Driven Facial Emotion Recognition. | CK+, FER-2013 | [43]
Emotion Recognition at a Distance: The Robustness of Machine Learning Based on Hand-Crafted Facial Features vs Deep Learning Models. | CK+ | [44]
Deep Learning-Based Dimensional Emotion Recognition Combining the Attention Mechanism and Global Second-Order Feature Representations. | AffectNet | [45]
On-Road Driver Facial Expression Emotion Recognition with Parallel Multi-Verse Optimizer (PMVO) and Optical Flow Reconstruction for Partial Occlusion in Internet of Things (IoT). | CK+, KMU-FED | [46]
Emotion Recognition by Web-Shaped Model. | CK+, KDEF | [47]
Edge-Enhanced Bi-Dimensional Empirical Mode Decomposition-Based Emotion Recognition Using Fusion of Feature Set. | eNTERFACE, CK, JAFFE | [48]
A Novel Driver Emotion Recognition System Based on Deep Ensemble Classification. | AffectNet, CK+, DFER, FER-2013, JAFFE, and a custom dataset | [22]
Table 5. Trending topics in facial emotion recognition studies.
1. Facial emotion recognition for mental health assessment (depression, schizophrenia)
2. Emotion analysis in human-computer interaction
3. Emotion recognition in the context of autism
4. Driver emotion recognition for intelligent vehicles
5. Assessment of emotional engagement in learning environments
6. Facial emotion recognition for apparent personality trait analysis
7. Facial emotion recognition for gender, age, and ethnicity estimation
8. Emotion recognition in virtual reality and smart homes
9. Emotion recognition in healthcare and clinical settings
10. Emotion recognition in real-world and COVID-19 masked scenarios
11. Personalized and group-based emotion recognition
12. Music-enhanced emotion recognition
13. Cross-dataset emotion recognition
14. Emotion recognition performance assessment from faces acquired at a distance
15. Facial emotion recognition for IoT and edge devices
16. Idiosyncratic bias in emotion recognition
17. Emotion recognition in socially assistive robots
18. In-the-wild facial emotion recognition
19. Video-based emotion recognition
20. Spatio-temporal emotion recognition in videos
21. Spontaneous emotion recognition
22. Emotion recognition using facial components
23. Comparing emotion recognition from genuine and unfelt facial expressions
Table 6. Databases most commonly utilized in the multi-physical classified studies incorporated into this review.
Database Name | Description | Advantages | Limitation
MELD (Multimodal Emotion Lines Dataset) [14] | Focuses on emotion recognition in movie dialogues. It contains transcriptions of dialogues and their corresponding audio and video tracks. Emotions are labeled at the sentence and speaker levels. | Large amount of data; multimodal (text, audio, video). | Emotions induced by movies; manually labeled.
IEMOCAP (Interactive Emotional Dyadic Motion Capture), 2005 [55] | Focuses on emotional interactions between two individuals during acting sessions. It contains video and audio recordings of actors performing emotional scenes. | Realistic data, emotional interactions, a wide range of emotions. | Not real induced emotions (acting).
CMU-MOSI (Multimodal Corpus of Sentiment Intensity), 2014/2017 [56] | Focuses on sentiment intensity in speeches and interviews. It includes transcriptions of audio and video, along with sentiment annotations. Updated in 2017 as CMU-MOSEI. | Emotions are derived from real speeches and interviews. | Relatively small size.
AVEC (Affective Behavior in the Context of E-Learning with Social Signals), 2007–2016 [57] | A series of competitions focused on the detection of emotions and behaviors in the context of online learning. It includes video and audio data of students participating in e-learning activities. | Emotions are naturally induced during online learning activities. | Context-specific data; emotion assessment limited to e-learning settings.
RAVDESS (The Ryerson Audio-Visual Database of Emotional Speech and Song), 2016 [58] | Audio and video database that focuses on emotion recognition in speech and song. It includes performances by actors expressing various emotions. | Diverse data in terms of emotions, modalities, and contexts. | Does not contain natural dialogues.
SAVEE (Surrey Audio–Visual Expressed Emotion), 2010 [59] | Focuses on emotion recognition in speech. It contains recordings of speakers expressing emotions through phrases and words. | Clean audio data. | –
SAMM (Spontaneous Micro-expression Dataset) [60] | Focuses on spontaneous micro-expressions that last only a fraction of a second. It contains videos of people expressing emotions in real emotional situations. | Real spontaneous micro-expressions. | –
CASME (Chinese Academy of Sciences Micro-Expression) [61] | Focuses on the detection of micro-expressions in response to emotional stimuli. It contains videos of micro-expressions. | Induced by emotional stimuli. | Not multicultural.
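The audio-visual corpora listed above are usually reduced to acoustic (and visual) feature vectors before modeling. The following minimal sketch computes MFCC-based utterance features with librosa on a synthetic one-second signal; the waveform, the feature set, and the summary statistics are illustrative assumptions rather than the pipelines used in the reviewed studies.

```python
# Minimal sketch: acoustic features of the kind commonly extracted from
# audio-visual corpora such as RAVDESS or IEMOCAP for speech emotion recognition.
# The one-second synthetic tone stands in for a real utterance.
import numpy as np
import librosa

sr = 16000                                   # sampling rate in Hz
t = np.linspace(0, 1.0, sr, endpoint=False)  # one second of audio
y = 0.1 * np.sin(2 * np.pi * 220 * t)        # placeholder waveform (220 Hz tone)

mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # shape: (13, frames)
delta = librosa.feature.delta(mfcc)                 # frame-to-frame dynamics

# A simple utterance-level representation: mean and std of each coefficient.
features = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1),
                           delta.mean(axis=1), delta.std(axis=1)])
print(features.shape)  # (52,)
```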
Table 7. Databases utilized in the studies classified as multi-physiological and multi-physical–physiological in this review.
Database Name | Description | Advantages | Limitation
WESAD (Wearable Stress and Affect Detection) [62] | Focuses on stress and affect recognition from physiological signals like ECG, EMG, and EDA, as well as motion signals from accelerometers. Data were collected while participants performed tasks and experienced emotions in a controlled laboratory setting, wearing wearable sensors. | Facilitates the development of wearable emotion recognition systems. | The dataset is relatively small, and participant diversity may be limited.
AMIGOS [63] | A multimodal dataset for personality traits and mood. Emotions are induced by emotional videos in two social contexts: individual viewers and groups of viewers. Participants’ EEG, ECG, and GSR signals were recorded using wearable sensors; frontal HD videos and full-body videos in RGB and depth were also recorded. | Participants’ emotions were scored by self-assessment of valence, arousal, control, familiarity, liking, and basic emotions felt during the videos, as well as external assessments of valence and arousal. | Reduced number of participants.
DREAMER [13] | Records physiological ECG, EMG, and EDA signals and self-reported emotional responses, collected during the presentation of emotional video clips. | Enables the study of emotional responses in a controlled environment and their comparison with self-reported emotions. | Emotions may be biased towards those induced by video clips, and the dataset size is limited.
ASCERTAIN [64] | Focuses on linking personality traits and emotional states through physiological responses like EEG, ECG, GSR, and facial activity data while participants watched emotionally charged movie clips. | Suitable for studying emotions in stressful situations and their impact on human activity. | The variety of emotions induced is limited.
DEAP (Database for Emotion Analysis using Physiological Signals) [65,66] | Includes physiological signals like EEG, ECG, EMG, and EDA, as well as audiovisual data. Data were collected by exposing participants to audiovisual stimuli designed to elicit various emotions. | Provides a diverse range of emotions and physiological data for emotion analysis. | The size of the database is small.
MAHNOB-HCI (Multimodal Human Computer Interaction Database for Affect Analysis and Recognition) [13,66] | Includes multimodal data, such as audio, video, physiological, ECG, EDA, and kinematic data. Data were collected while participants engaged in various human–computer interaction scenarios. | Offers a rich dataset for studying emotional responses during interactions with technology. | –
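To illustrate the kind of preprocessing the physiological datasets above typically require, the sketch below derives per-channel band-power features from a multichannel recording using Welch’s method from SciPy. The random placeholder signal, the 128 Hz sampling rate, the channel count, and the band limits are assumptions chosen for illustration; the code does not load DEAP, AMIGOS, or any other real dataset.

```python
# Minimal sketch: band-power features from a multichannel physiological recording,
# the kind of representation often fed to emotion classifiers trained on datasets
# such as DEAP or AMIGOS. Signal, rate, and band limits are illustrative assumptions.
import numpy as np
from scipy.signal import welch

fs = 128                                 # sampling rate in Hz (assumption)
n_channels, n_samples = 32, fs * 60      # one minute of 32-channel placeholder data
rng = np.random.default_rng(0)
signals = rng.normal(size=(n_channels, n_samples))

bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

# Welch periodogram per channel, then integrate power inside each band.
freqs, psd = welch(signals, fs=fs, nperseg=fs * 2, axis=-1)
df = freqs[1] - freqs[0]                 # frequency resolution of the periodogram
features = []
for low, high in bands.values():
    mask = (freqs >= low) & (freqs < high)
    features.append(psd[:, mask].sum(axis=-1) * df)

feature_matrix = np.stack(features, axis=1)  # shape: (channels, bands)
print(feature_matrix.shape)                  # (32, 4)
```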
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
