Article

Centrifugal Navigation-Based Emotion Computation Framework of Bilingual Short Texts with Emoji Symbols

1 Education Information Technology Center, China West Normal University, Nanchong 637002, China
2 School of Electronic and Information Engineering, China West Normal University, Nanchong 637002, China
3 School of Computer Science, China West Normal University, Nanchong 637002, China
* Author to whom correspondence should be addressed.
Electronics 2023, 12(15), 3332; https://doi.org/10.3390/electronics12153332
Submission received: 18 June 2023 / Revised: 2 August 2023 / Accepted: 2 August 2023 / Published: 3 August 2023
(This article belongs to the Special Issue Advances in Intelligent Data Analysis and Its Applications)

Abstract

Heterogeneous corpora mixing Chinese, English, and emoji symbols are increasingly common on social platforms. Previous sentiment analysis models are unable to calculate emotional scores of such heterogeneous corpora. They also struggle to effectively fuse the emotional tendencies of these corpora with emotional fluctuation, leading to low accuracy of tendency prediction and score calculation. To address these problems, this paper proposes a Centrifugal Navigation-Based Emotional Computation framework (CNEC). CNEC adopts Emotional Orientation of Related Words (EORW) to calculate scores of unknown Chinese/English words and emoji symbols. In EORW, t neighbor words of the predicted sample from one element in the short text are selected from a sentiment dictionary according to spatial distance, and related words are extracted from the t neighbor words using the emotional dominance principle. The emotional scores of the related words are fused to calculate the score of the predicted sample. Furthermore, CNEC utilizes Centrifugal Navigation-Based Emotional Fusion (CNEF) to achieve the emotional fusion of heterogeneous corpora. In CNEF, how emotional fluctuation occurs is illustrated by the trigger angle of centrifugal motion in physical theory. In light of the correspondence between the trigger angle and the conditions of emotional fluctuation, the fluctuation position is determined. Lastly, emotional fusion under emotional fluctuation is carried out by the CNEF function, which treats the fluctuation position as the most significant position. Experiments demonstrate that the proposed CNEC effectively computes emotional scores for bilingual short texts with emojis on the collected Weibo dataset.

1. Introduction

Sentiment analysis (SA), also referred to as opinion mining, is an academic field aimed at extracting users’ views, attitudes, and emotions towards events of interest using specific rules and techniques applied to textual data. The term sentiment analysis was first introduced in 2003, when Nasukawa and Yi [1] extracted emotional features related to specific topics from documents. The tasks of sentiment analysis are primarily classified into three levels: document-level, sentence-level, and aspect-level analysis. The sentiment analysis of customer reviews, for example, is a binary classification task that aims to determine the polarity of opinions and belongs to the sentence-level category [2]. Multi-class text categorization [3] was later presented as a task equivalent to the rating-inference problem. In addition, aspect-based sentiment analysis [4] focuses on the aspect words in a sentence, assigning a polarity to every aspect.
In the current era of data expansion and AI agglomeration, SA, as a part of AI, is applied across many walks of life. Sentiment analysis is utilized on movie reviews to perform fine-grained analysis and determine both the sentiment orientation and sentiment strength of the reviewer towards various aspects of a movie [4]. In addition, the identification and extraction of covert social networks represents a critical challenge for government security in the domain of e-government [5]. On social platforms, users’ opinions are reflected in reactions to trending news and public messages. Moreover, the frequency of emojis in posted messages is increasing; especially in response to hot events, posts containing emojis emerge one after another [6]. Furthermore, regarding emoji usage on social platforms, sarcasm detection is also a concern of researchers [7]: the architecture in [7] is trained on two embeddings, namely word and emoji embeddings, and combines an LSTM with the loss function of an SVM for sarcasm detection.
While sentiment analysis has been widely studied, most researchers have focused on analyzing a single corpus. This paper, however, presents a novel framework, CNEC, to perform sentiment analysis of bilingual, emoji-containing text on social media platforms. CNEC employs the Emotional Orientation of Related Words (EORW) technique designed in previous work [8] to calculate the emotional scores of Chinese phrases or English words. Additionally, each emoji is mapped to its true meaning in the form of a word by LinkMap, which is designed to associate text and emojis; the emotional score of the emoji is then calculated by EORW, driven by CNEC. In addition, CNEC utilizes the centrifugal motion framework from physics to describe emotional fluctuations, and uses CNEF to fuse the emotional scores of different corpora into the emotional score of the predicted text. Based on Maximum Density Dominance, emotional scores of texts free of emotional fluctuations can also be handled. This contribution is currently unique, setting our approach apart from the existing literature.

2. Related Work

Among traditional machine learning methods, Bayesian classifiers and support vector machines [9] are commonly used for sentiment analysis. In addition, a feature set selection method for social network sentiment analysis based on information gain, bigrams, and object-oriented extraction methods was introduced. To address the performance degradation caused by aspect-based methods that cannot reasonably adapt a general vocabulary to the context of aspect-based datasets, Mohammad Erfan Mowlaei et al. [10] proposed two extensions of dictionary generation methods for aspect-oriented problems—statistical methods and their previously proposed genetic algorithms—which fuse the resulting vocabulary with prominent static words to classify the aspects in reviews. The ALGA algorithm was proposed [11] to address the task of polarity classification of microblog emotions; this algorithm constructs an adaptive emotional vocabulary and seeks the optimal emotional vocabulary for the task. An aspect-based hybrid approach to sentiment analysis that integrates domain vocabulary and rules was proposed [12] to analyze the entities of intelligent application reviews, extract important aspects from comments, achieve sentiment classification, and finally produce summary results, in order to understand the needs and expectations of customers. A double feed-forward neural network [13] was used to pass output layer information to a two-layer neural network to optimize and process information for emotional classification.
A scholarly approach was developed to detect sarcasm, as outlined in the study [7]. This approach employed an architectural framework that integrated two types of embeddings—word embeddings and emoji embeddings—and leveraged LSTM (Long Short-Term Memory) in conjunction with a loss function derived from SVM (Support Vector Machines). A deep learning framework [14] was proposed for analyzing product reviews on the YouTube social media platform, which could automatically collect, filter, and analyze reviews of a specific product from YouTube. Aminu Da’u et al. [15] proposed a recommendation system to address the time-consumption and accuracy problems arising in aspect-based opinion mining of user comments, adopting a deep learning method based on aspect-weighted opinion mining. This method uses deep learning to extract aspects of products and the underlying weighted user opinions from review text, and fuses them into extended collaborative filtering (CF) technology to improve the recommendation system. With the intention of keeping up with the speed of streaming data generated on social media platforms and analyzing users’ emotions on topics, Ajeet Ram Pathak et al. [16] proposed a topic-level sentiment analysis model based on deep learning. The proposed model uses online latent semantic indexing with regularization constraints to extract topics at the sentence level, and then applies a topic-level attention mechanism to an LSTM network for sentiment analysis. An improved sentiment analysis method [17] was presented to classify sentence types using BiLSTM-CRF and CNN for different types of emotions; this method divides sentences into different types and then performs sentiment analysis on each type of sentence. Bin Liang et al. [18] proposed a SenticNet-based graph convolutional network that builds graph neural networks by integrating sentiment knowledge from SenticNet to enhance the dependency graph of sentences. On this basis, the sentiment-enhanced graph model considers the dependence between contextual words and aspect words and the emotional information between opinion words and aspects. Since the influence of contextual inter-sentence associations was considered, an aspect-level sentiment analysis model [19] was proposed with aspect-specific contextual position information, which could extract the influence of the contextual association of each sentence in the document on the aspect sentiment polarity of individual sentences. A two-way LSTM (ET-Bi-LSTM) [20] emotion analysis model was designed for emoticon–text integration in order to accurately classify the emotion of microblog comments with emoticons in microblog social networks. A model for predicting sentiment polarity on social media, which incorporates an emoji-aware attention-based GRU network, was proposed [21]. Bi-LSTM-SNP [22] was designed to capture the contextual semantic correlation between aspect words and content words more effectively. Beyond the LSTM backbone, graph convolutional neural network models [23,24,25,26] have been widely used for aspect-level sentiment analysis. Moreover, based on ensemble learning, the sentiment analysis task was accomplished by combining Bi-LSTM and Graph Convolutional Neural Network (GCN) techniques [27]. In addition, some works use fuzzy theory to perform sentiment analysis [28,29,30].

3. Centrifugal Navigation-Based Emotion Computation Framework of Bilingual Short Texts with Emoji Symbols

In Section 3.1, this paper analyzes the problems of emotional bilingual texts with emojis and describes how these problems are tackled. Section 3.2 explains how the emotional computation of unknown phrases and emoji symbols is performed by adopting EORW. Section 3.3 depicts emotional fluctuations via centrifugal motion and illustrates the emotional computation of texts subject to emotional fluctuations.

3.1. Problem Analysis and Basic Definition

Problem Analysis. To compute an emotional score, the pipeline of the traditional model has five steps: (1) word segmentation; (2) word retrieval; (3) output words and their scores; (4) emotional fusion of all words; (5) output the emotion score of the text. There are two problems with the traditional model, as depicted in Figure 1.
Problem 1.
The sentiment dictionary cannot encompass the emotional scores of all words that belong to Chinese or English. In addition, the direct calculation of emotional scores for emojis is not feasible due to the absence of a specialized sentiment dictionary for emojis.
Problem 2.
Traditional emotional fusion adopts an average method [31] or maximum-value strategy [32], resulting in insufficiently accurate scores.
From what has been presented, this paper aims to address three issues: first, emotional computation for unknown words and emojis; second, describing the process of sentiment fluctuation; and third, sentiment calculation for texts containing sentiment fluctuations. For the first issue, CNEC employs EORW to compute the emotional scores of unknown words; after texts are transformed through the LinkMap designed in this paper, the emotional scores of emojis’ textual forms are also calculated using EORW. Then, inspired by circular motion, the centrifugal process of circular motion is adopted by CNEC to portray emotional fluctuation. Finally, the emotional scores of mixed texts subject to emotional fluctuations are computed through emotional fusion by CNEF.
Basic Definition. For facilitating understanding and description, this subsection defines some parameters and functions.
Definition 1.
Emotional Dictionary Dic. Dic = {w1, w2, …, wN}, where wi represents the i-th word, and N = size of Dic.
Definition 2.
Unknown Word uw. According to Function Rtr(·), the score of uw is empty, as depicted in Equation (1).
$Rtr(uw, Dic) = \text{empty}$ (1)
Definition 3.
Word Vector Vi of wi. Vi is calculated by a BERT-base model (The structure of the BERT-base model is illustrated in Figure 2), as shown in Equation (2).
$V_i = \mathrm{BERT}_{embd}(w_i)$ (2)
Definition 4.
Word Similarity. Sim(wi, wj) is utilized to compute word similarity between wi and wj in space. The value domain of Sim(wi, wj) ∈ [−1, 1]. The closer the result is to 1, the greater the word similarity.
$Sim(w_i, w_j) = \dfrac{\sum_{i=1}^{D} V_i V_j}{\sqrt{\sum_{i=1}^{D} (V_i)^2}\,\sqrt{\sum_{i=1}^{D} (V_j)^2}}$ (3)
where D represents the dimension of word vectors.
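Equations (2) and (3) can be reproduced in a few lines of code. The sketch below is a minimal illustration using the Hugging Face transformers library; the checkpoint name bert-base-multilingual-cased and the mean pooling of sub-word vectors are assumptions, since the paper only states that a BERT-base model produces the word vectors.

```python
# Minimal sketch of Equations (2)-(3): word vectors from a BERT-base model
# and their cosine similarity. Checkpoint choice and mean pooling are assumptions.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def bert_embd(word: str) -> torch.Tensor:
    """Equation (2): V_i = BERT_embd(w_i), averaged over sub-word tokens."""
    inputs = tokenizer(word, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state.squeeze(0).mean(dim=0)

def sim(v_i: torch.Tensor, v_j: torch.Tensor) -> float:
    """Equation (3): cosine similarity, a value in [-1, 1]."""
    return torch.nn.functional.cosine_similarity(v_i, v_j, dim=0).item()

print(sim(bert_embd("hero"), bert_embd("anti-hero")))
```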
Definition 5.
Neighbor Words Set, NW-Set. NW-Set stores the first t words with word similarity close to 1. NW-Set = {w1, w2, …, wt}, t ∈ N+. Rank(·) function is to choose the t nearest words.
$NW\text{-}Set = \{Rank(Sim(w_i, w_j))\}_t$ (4)
Definition 6.
Mutually Exclusive Subset, MESi. MES1 and MES2 have opposite emotional tendencies. For example, when the tendency of MES1 is positive, the tendency of MES2 is negative. Conversely, when the tendency of MES1 is negative, the tendency of MES2 is positive.
$\begin{cases} \left[MES_1 = \{w_1^1, \ldots, w_1^x\},\ ten(w_1^i) = \text{positive},\ i = 1, \ldots, x\right] \wedge \left[MES_2 = \{w_2^1, \ldots, w_2^y\},\ ten(w_2^i) = \text{negative},\ i = 1, \ldots, y\right] \\ \left[MES_1 = \{w_1^1, \ldots, w_1^x\},\ ten(w_1^i) = \text{negative},\ i = 1, \ldots, x\right] \wedge \left[MES_2 = \{w_2^1, \ldots, w_2^y\},\ ten(w_2^i) = \text{positive},\ i = 1, \ldots, y\right] \end{cases}$ (5)
where wiθ represents the θ-th word in the i-th MES. θ ∈ [1, x] or [1, y], and x, y < t. ten(·) denotes the tendency.
Definition 7.
Emotional Dominance. The set containing the largest quantity of words is considered as the emotionally dominant set—Related Words Set (RS).
$RS = Dom(MES_1, MES_2) = \begin{cases} MES_1 & (x > y) \\ MES_2 & (x < y) \end{cases}$ (6)
where x represents the quantity of MES1, and y represents the quantity of MES2.
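A minimal sketch of Definitions 6 and 7 follows: the neighbor words are split into two mutually exclusive subsets by tendency, and the larger subset is kept as the related words set RS. The scores dictionary is a hypothetical input; the example values are taken from Table 4.

```python
# Minimal sketch of Definitions 6-7: mutually exclusive subsets and
# emotional dominance. `scores` maps each word to its normalized score.
from typing import Dict, List

def related_words(nw_set: List[str], scores: Dict[str, float]) -> List[str]:
    mes_pos = [w for w in nw_set if scores[w] > 0]   # one mutually exclusive subset
    mes_neg = [w for w in nw_set if scores[w] < 0]   # the opposite-tendency subset
    # Equation (6): the larger subset is the emotionally dominant set RS.
    return mes_pos if len(mes_pos) > len(mes_neg) else mes_neg

scores = {"firing": -0.315068, "alarm": -0.315068, "destroyed": -0.534247,
          "fumes": 0.041096, "fired": -0.643836}
print(related_words(list(scores), scores))   # ['firing', 'alarm', 'destroyed', 'fired']
```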

3.2. Emotional Computation of Unknown Words and Emojis

Definition 8.
EORW. Emotional Orientation of Related Words. According to Word Similarity, the NW-Set of uw can be computed. Then, based on Emotional Dominance, the Related Words Set RS is calculated. Ultimately, the emotional score of uw is fused by emotional scores of words in RS.

3.2.1. Emotional Computation of Unknown Words

The process for computing the emotional score of unknown vocabulary uw involves mapping its word vector Euw in space using a BERT-base model, as illustrated in Equation (2), where the form of Euw is Euw = (x1, x2, …, xD), xi ∈ R and D represents the dimension of the word vector Euw.
Step 1. The word embedding Edic of the emotional dictionary Dic is also calculated by BERTembd(·), shown in Equation (2), where Dicwn represents words in Dic, and the form of Edic is similar to that of Euw.
Step 2. To obtain the neighbor words set (NW-Set), the principle of spatial similarity is utilized to compute the similarity between uw and Dicwn. The principle of spatial similarity is denoted by Equation (3), where Euwi and Edici here represent the components of the vectors Euw and Edic, respectively.
Step 3. The Dicwn are organized in descending order based on the result of Equation (3), with the aim of identifying the top t words of Dicwn. These t words, considered as neighbors of uw, are put into NW-Set, as shown in Equation (4), and this operation is repeated t times.
Step 4. The form of NW-Set is shown as NW-Set = {w1, …, wt}. After getting the NW-Set, EORW is applied to identify wi in NW-Set with similar emotional tendencies.
Step 5. If the emotional tendencies of the t words are consistent, the emotional score of the unknown word is the average of the t words’ emotional scores. Conversely, if the t words’ emotional tendencies are inconsistent, the MESi can be calculated by Equation (5).
Step 6. From the MESs, Dom(MES1, MES2) yields the dominant set RS, as shown in Equation (6).
Step 7. The emotional scores of w in RS are fused to calculate the final score Suw of uw. Suw is represented as Equation (7).
$S_{uw} = \begin{cases} \dfrac{\sum_{i=1}^{t} e_i}{t} & (\text{consistent}) \\ \dfrac{\sum_{i=1}^{x} e_i}{x} & (\text{exclusive}) \end{cases}$ (7)
where ei (i = 1, 2, …, t) represents a score in the group of t words with consistent emotional tendencies, while ei (i = 1, 2, …, x) represents a score in the group of x scores in RS.
Figure 3 illustrates the comprehensive workflow of computing the emotional score of unknown words.
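Steps 1–7 can be summarized in a short sketch. The snippet below assumes bert_embd() and sim() from the earlier sketch and a dictionary dic mapping each word in Dic to its normalized score; embedding the whole dictionary on every query is done here only for brevity, and in practice the dictionary embeddings would be precomputed.

```python
# Minimal sketch of the EORW workflow (Steps 1-7) for an unknown word uw.
from typing import Dict

def eorw_score(uw: str, dic: Dict[str, float], t: int = 9) -> float:
    e_uw = bert_embd(uw)                                      # Step 1: embed uw
    ranked = sorted(dic, key=lambda w: sim(e_uw, bert_embd(w)), reverse=True)
    nw_set = ranked[:t]                                       # Steps 2-4: NW-Set
    pos = [w for w in nw_set if dic[w] > 0]
    neg = [w for w in nw_set if dic[w] < 0]
    if not pos or not neg:                                    # Step 5: consistent tendency
        return sum(dic[w] for w in nw_set) / t
    rs = pos if len(pos) > len(neg) else neg                  # Step 6: emotional dominance
    return sum(dic[w] for w in rs) / len(rs)                  # Step 7: Equation (7)
```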

3.2.2. Emotional Computation of Emojis

Given that the emotional score of emojis can’t be directly calculated, it is necessary to rely on other methods. This paper adopts the EORW method to accomplish the goal of computing the emotional score of emojis. Based on EORW, an emoji is considered as a uw. Therefore, the meaning of emojis in textual form needs to be denoted. However, there are discrepancies between the Chinese shapes and English appearances of emojis emerging in textual data. In order to tackle this issue and make sentiment analysis easier, an emoji linking map LinkMap has been developed. Some emojis are shown in the following Table 1.
In the textual data, the form of a Chinese emoji Cemoji is shown in Equation (8), and similarly the shape of an English emoji Eemoji is depicted in Equation (9).
$C_{emoji} = [Ch\_w]$ (8)
where Ch_w represents short Chinese words, illustrating the meaning of the emoji.
$E_{emoji} = \text{:}w_i\_w_j\text{:}$ (9)
where wi_wj indicates that two words linked by an underscore explain this emoji. wi and wj are English words.
Then, through LinkMap, emofMeaning can be denoted as Equation (10).
$emofMeaning = LinkMap(emoji)$ (10)
where emofMeaning represents the English phrase of the emoji meaning.
Through the operation of LinkMap, the emoji is revealed in the textual representation emofMeaning[uw]. Following that, the EORW technique is utilized; the procedure of EORW is described in Section 3.2.1.
As depicted in Figure 4, based on LinkMap, the emoji ‘loudly crying face’ corresponds to [crying] and ‘:Loudly_Crying_Face:’, and its meaning ‘tear’ is computed. Subsequently, ‘tear’ is fed into the embedding layer, and its vector is calculated by a BERT-base model as in Equation (2). Then, through the EORW method, RS(crying) can be acquired. As a result, the emotional score of the emoji ‘loudly crying face’ is calculated by the emotional fusion of the words w in RS(crying).
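A minimal sketch of LinkMap and the emoji scoring step follows. The mapping entries mirror Table 1 and the examples in the text, and eorw_score() is the sketch from Section 3.2.1; the full contents of LinkMap in the authors' implementation are not published, so this mapping is illustrative only.

```python
# Minimal sketch of Equations (8)-(10): map an emoji to its textual meaning
# and score that meaning as an unknown word with EORW. Entries follow Table 1.
LINK_MAP = {
    "😭": "crying",   # loudly crying face
    "😫": "tired",    # tired face
    "😀": "hah",      # grinning face
    "👍": "like",     # thumbs up
}

def emoji_score(emoji: str, dic: dict) -> float:
    emof_meaning = LINK_MAP[emoji]        # Equation (10): emofMeaning = LinkMap(emoji)
    return eorw_score(emof_meaning, dic)  # treat the textual meaning as an unknown word
```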

3.3. Emotional Computation of Bilingual Short Texts with Emojis in Emotional Fluctuation

This section describes how to compute the emotional score of the short text with emoji. According to Section 3.2.1, the scores of words that can’t be retrieved from the emotional dictionary are calculated by the EORW method. Then emotional fluctuation is considered.
Step 1. The input sentence S is fed into the model, and its internal components are segmented into three distinct elements: the Chinese element C, the English element E, and the emoji element e. S is represented by the following equation, Equation (11).
$S = [\underbrace{w_1, \ldots, w_i}_{C},\ \underbrace{w_1, \ldots, w_k}_{E},\ \underbrace{e}_{emoji}]$ (11)
where w represents words in S, i words belong to C, k words belong to E, and e denotes the emoji.
Step 2. C can be represented by Equation (12), and E can be represented in the same way by Equation (13).
$C = [w_1, \ldots, w_i]$ (12)
$E = [w_1, \ldots, w_k]$ (13)
Step 3. This article supposes that nouns N, verbs V, adjectives Adj, and adverbs Adv have a significant impact on S. Therefore, these words are retained and extracted into a keyword subset K. K has two forms in this paper: one is Kc and the other is Ke. Kc contains Chinese keywords and Ke contains English keywords; Kc and Ke are respectively denoted as Equations (14) and (15).
$K_C = [N_C, V_C, Adj_C, Adv_C]$ (14)
$K_e = [N_e, V_e, Adj_e, Adv_e]$ (15)
where the subscripts C and e respectively represent C and E.
Step 4. emofMeaning is fed into the embedding layer and treated as English words, using EORW to compute its emotional score.
Step 5. In C and E, after generating K, the emotional scores of the words in K can be computed by the EORW technique if these words do not belong to the emotional dictionary. Then, this paper checks whether the emotion fluctuates. If a fluctuation position exists, CNEF is designed to calculate the emotional score of the sentence; in other words, more attention is paid to the fluctuation position, which is considered the most significant position, while the remaining elements are treated as the other elements M.
Emotional Fluctuation and emotional fusion are introduced in the following sections.
Emotional Fluctuation. There are definitions to describe the emotional fluctuation and illustrate the categories of different emotional fluctuations.
Definition 9.
Emotional Fluctuation. A sentence S = {element1, element2, element3} is divided into three elements element1, element2, and element3. Emotional fluctuation appears when the emotional tendencies of element1, element2, and element3 are not consistent, such as (element1, positive) (element2, negative) ← emotional fluctuation appears in S or (element1, positive) (element2, positive) (element3, negative) ← emotional fluctuation appears in S.
Definition 10.
Emotional Fluctuation Position and Normal Position. The position of emotional inconsistency is defined as the Emotional Fluctuation Position. Other positions are considered as the Normal Position. For example, S = (element1, positive, Normal Position) (element2, positive, Normal Position) (element3, negative, Emotional Fluctuation Position).
The emotional fluctuations are classified into three categories by CNEC: (1) the second element is identified as the position of emotional fluctuation when the Chinese, English, and emoji demonstrate positive-negative-negative or negative-positive-positive emotional tendencies; (2) the last element is considered as the position of emotional fluctuation when the emotional tendencies of the Chinese, English, and emoji elements are negative-negative-positive or positive-positive-negative; (3) the emotions expressed in the text are generally reckoned to be consistent with the emotional tendencies of the beginning and end, when the emotional fluctuation occurs either at the beginning or end and the emotional tendencies of the beginning and end are consistent. These situations are shown in Figure 5.
Moreover, when confronted with fluctuation between two elements, a similar approach is taken as in the case of three elements. The primary adjustment is made to the element whose sentiment tendency has changed. In light of what has been previously presented, the fluctuation position Fl_P can be denoted as Equation (16). Equation (16) aligns with various situations depicted in Figure 5. Specifically, Case 1 corresponds to M2, Case 2 corresponds to M3, and Case 3 corresponds to M1 and M3. Furthermore, in situations where only two modules are present, Case 4 of Equation (16) pertains to M2.
$Fl\_P = \begin{cases} M_2 & (sgn(M_1) > 0 \ \text{and}\ sgn(M_2) < 0 \ \text{and}\ sgn(M_3) < 0)\ \text{or}\ (sgn(M_1) < 0 \ \text{and}\ sgn(M_2) > 0 \ \text{and}\ sgn(M_3) > 0) \\ M_3 & (sgn(M_1) < 0 \ \text{and}\ sgn(M_2) < 0 \ \text{and}\ sgn(M_3) > 0)\ \text{or}\ (sgn(M_1) > 0 \ \text{and}\ sgn(M_2) > 0 \ \text{and}\ sgn(M_3) < 0) \\ M_1\ \&\ M_3 & (sgn(M_1) > 0 \ \text{and}\ sgn(M_2) < 0 \ \text{and}\ sgn(M_3) > 0)\ \text{or}\ (sgn(M_1) < 0 \ \text{and}\ sgn(M_2) > 0 \ \text{and}\ sgn(M_3) < 0) \\ M_2 & (sgn(M_1) > 0 \ \text{and}\ sgn(M_2) < 0)\ \text{or}\ (sgn(M_1) < 0 \ \text{and}\ sgn(M_2) > 0) \end{cases}$ (16)
where Mi represents the i-th element of three elements. Function sgn(·) stands for the sign of the emotional score.
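Equation (16) reduces to a sign comparison over the element scores. The sketch below is a minimal illustration; the example call reproduces the bilingual row of Table 9, and scores of exactly zero are treated as negative for simplicity.

```python
# Minimal sketch of Equation (16): locate the fluctuation position from the
# signs of the element scores (m1 = Chinese, m2 = English, m3 = emoji).
def fluctuation_position(m1: float, m2: float, m3: float = None) -> str:
    s = lambda x: 1 if x > 0 else -1
    if m3 is None:                              # only two elements present (last case)
        return "M2" if s(m1) != s(m2) else "none"
    if s(m1) != s(m2) and s(m2) == s(m3):
        return "M2"                             # e.g. (+, -, -) or (-, +, +)
    if s(m1) == s(m2) and s(m2) != s(m3):
        return "M3"                             # e.g. (-, -, +) or (+, +, -)
    if s(m1) == s(m3) and s(m1) != s(m2):
        return "M1 & M3"                        # e.g. (+, -, +) or (-, +, -)
    return "none"                               # consistent tendency, no fluctuation

print(fluctuation_position(0.301340, -0.472603, -0.354642))  # "M2", as in Table 9
```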
Emotional Computation of Texts in Emotional Fluctuation. To elucidate the emotional fluctuation more effectively, a Centrifugal Navigation-Based Emotion Computation framework (CNEC) employs the centrifugal process of circular motion in physics to describe the phenomenon of emotional fluctuation. In the CNEC framework, mf represents the emotional fluctuation element, which is analogous to an object m in circular motion. M illustrates other elements equal to the center of circular track, which is denoted as O in physics. In addition, R denotes the radius of the circle, which is determined by the distance between M and mf. As M and mf are not point particles in reality, their separation distance is considered to be the sum of their individual lengths. R can be represented by Equation (17).
$R = length_x + length_y$ (17)
where lengthx represents the length of mf, and lengthy means the length of M.
When mf undergoes uniform motion along a circular orbit, it illustrates that there is no emotional fluctuation between M and mf. When there is emotional fluctuation between M and mf, mf, equivalent to the object m, moves in a centrifugal motion, departing from the circular orbit. The moment centrifugal motion occurs, the condition where the angle θ between the velocity direction of mf and the line connecting mf to the center O of the circle is greater than 90 degrees corresponds to Equation (16). In other words, the condition of emotional fluctuation is equal to the condition of centrifugal motion. As shown in Figure 6, an example of a bilingual text with an emoji that is in a fluctuation position illustrates the corresponding relationship between the workflow of emotional fluctuation and centrifugal motion in the CNEC framework.
S = “为你挑选了实用的礼物, 而你stupid 😫” (Translation: I select practical gifts for you, but you’re stupid 😫), a sentence with an emoji, is put into the model. Then, this study splits the sentence S and puts the Chinese, English, and emoji parts into different elements through Equations (12) and (13). The format of the segments is [C][E][Cemoji]. Using the word tokenization of BERT, C is split into individual words wi stored in a set CW, i < N, where N represents the length of C. Similarly, k words wk are stored in a set EW, k < M, where M represents the length of E. Then, the Jieba library is utilized to mark the part of speech of wi in CW. As a result, Nc, Vc, Adjc, and Advc are extracted from CW and saved into Kc. Moreover, the NLTK library is used to mark the part of speech of wk in EW, so Ne, Ve, Adje, and Adve are extracted from EW into Ke. These conditions are shown in Equations (14) and (15). Later, Kc is fed into the embedding layer, the word vector set Ec is obtained by Equation (2), and the word vector set Ee is obtained by Equation (2) in the same way.
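A minimal sketch of this segmentation and keyword-extraction step is shown below. Jieba and NLTK are the libraries named in the text; the regular expressions used to separate the Chinese, English, and emoji parts are assumptions added for illustration.

```python
# Minimal sketch of Steps 1-3: split a mixed sentence into Chinese / English /
# emoji elements, then keep nouns, verbs, adjectives, and adverbs as keywords
# (Equations (14)-(15)). Requires nltk.download('punkt') and
# nltk.download('averaged_perceptron_tagger') on first use.
import re
import jieba.posseg as pseg
import nltk

def split_elements(sentence: str):
    chinese = "".join(re.findall(r"[\u4e00-\u9fff]", sentence))
    english = " ".join(re.findall(r"[A-Za-z']+", sentence))
    emojis = re.findall(r"[\U0001F300-\U0001FAFF\u2600-\u27BF]", sentence)
    return chinese, english, emojis

def chinese_keywords(text: str):
    # Keep nouns (n*), verbs (v*), adjectives (a*), and adverbs (d*) from jieba flags.
    return [w for w, flag in pseg.cut(text) if flag[0] in ("n", "v", "a", "d")]

def english_keywords(text: str):
    tags = nltk.pos_tag(nltk.word_tokenize(text))
    return [w for w, tag in tags if tag.startswith(("NN", "VB", "JJ", "RB"))]
```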
On the basis of EORW, $S_i^t$ can be calculated, which represents the emotional score of the i-th word with t neighbor words. Then the score SKc of Kc is fused from $S_i^t$, and the score SKe of Ke is fused from $S_i^t$, as in the following Equation (18).
$S_{K_x} = \dfrac{\sum_i S_i^t}{\rho_x}$ (18)
where ρc stands for the density of Kc, which is equal to the length of Kc; for the Chinese element, SKc replaces SKx and ρc replaces ρx. Similarly, ρe stands for the density of Ke, which is equal to the length of Ke; for the English element, SKe replaces SKx and ρe replaces ρx.
When 😫 is processed sequentially by Equations (9) and (10), its emofMeaning can be computed. After that, emofMeaning[tired] is fed into the embedding layer, obtaining EemofMeaning[tired]. Then, the emoji’s score Se can be calculated by EORW.
After collecting SKc, SKe, and Se, the emotional fluctuation can be checked by Equation (16). When a fluctuation position exists, SKc represents the score of Kc corresponding to the score of M1, SKe represents the score of Ke corresponding to the score of M2, and Se represents the score of e corresponding to the score of M3. Finally, the text score FS can be calculated by Emotional Fusion, described in the next section.
As depicted in Table 2, each parameter of centrifugal motion corresponds to parameters in emotional fluctuation.
Centrifugal Navigation-Based Emotional Fusion. When fluctuation doesn’t exist, it means that the tendencies among C, E, and e are consistent. Therefore, the final score FS of a short text S is computed by Maximum Density Dominance, as shown in Equation (19).
$FS = Max\!\left(\dfrac{m_i}{\rho_i}\right), \quad \text{consistent tendency}$ (19)
where Function Max(·) determines the largest score of the three elements (m1, m2, m3), and ρi represents the density of i-th element that is equal to the length of i-th element.
On the contrary, when one of emotional fluctuations emerges among the three elements, emotional fusion is utilized as shown in Equation (20).
$FS = sgn(m_f)\,\dfrac{escore_{m_f}}{escore_{M}}, \quad \text{inconsistent tendency}$ (20)
where Function sgn(·) extracts emotional tendency.
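A sketch of the fusion dispatch is shown below. It reads Equations (19) and (20) literally as the ratios printed above, averages the non-fluctuating elements into a single score for M, and lets the M1 & M3 case follow the consistent beginning and end, per the case analysis around Figure 5; all of these readings are assumptions rather than the authors' released implementation, and fluctuation_position() is the earlier sketch.

```python
# Hedged sketch of Equations (19)-(20). `scores` and `lengths` hold the
# per-element emotional scores and densities (element lengths).
def final_score(scores, lengths, fl_p):
    if fl_p == "none":
        # Equation (19): Maximum Density Dominance over consistent elements.
        return max(m / r for m, r in zip(scores, lengths))
    if fl_p in ("M1", "M2", "M3"):
        i = int(fl_p[1]) - 1
        score_mf = scores[i]                               # fluctuation element m_f
        others = [s for j, s in enumerate(scores) if j != i]
        score_m = sum(others) / len(others)                # assumed fused score of M
        sign = 1.0 if score_mf > 0 else -1.0
        # Equation (20), read literally: sgn(m_f) times the escore ratio.
        return sign * abs(score_mf) / max(abs(score_m), 1e-9)
    # Case "M1 & M3": the overall emotion follows the consistent beginning and end.
    return max(scores[j] / lengths[j] for j in (0, 2))
```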
Workflow of emotional computation with two elements. To facilitate understanding and examination of the two-element case, an instance of a monolingual corpus containing an emoji is presented below, as demonstrated in Figure 7. S, an English sentence with an emoji, is put into the model. Then, this study splits the sentence S and puts the English part and the emoji into different groups. The format of the segments is [E][Cemoji]. Using word tokenization, E is split into individual words wi stored in a set EW, i < N, where N represents the length of E. Then, NLTK is utilized to mark the part of speech of wi in EW. As a result, N, V, Adj, and Adv are extracted from EW and saved into Ke. Moreover, C is empty, so Kc is the null set Ø. Later, Ke is fed into the embedding layer, and the word vector set Ee is obtained by Equation (2).
On the basis of EORW, $S_i^t$ can be calculated, which represents the emotional score of the i-th word with t neighbor words. Then the score SKe of Ke is fused from $S_i^t$ as in the following Equation (21).
$S_{K_x} = \dfrac{\sum_i S_i^t}{\rho_x}$ (21)
where ρe stands for the density of Ke that is equal to the length of Ke, SKe replaces SKx, and ρe replaces ρx.
When 😭 enters Equation (10), its emofMeaning can be computed. After that, emofMeaning[crying] is fed into the embedding layer, obtaining EemofMeaning[crying]. Then, the emoji’s score Se can be calculated by EORW.
After collecting SKe and Se, the emotional fluctuation can be checked by Equation (16). When a fluctuation position exists, SKe represents the score of Ke corresponding to the score of M1, and Se represents the score of e corresponding to the score of M2. Finally, the text score FS can be calculated by Equation (20).
Conversely, if the fluctuation doesn’t exist, Equation (19) is adopted to compute the final score FS.

4. Experiment

4.1. Dataset

To perform the computation of scores with expert knowledge, the sentiment dictionary with emotional scores is an essential tool. In this paper, the Boson dictionary including 114,767 words is utilized to retrieve emotional scores of words. In this dictionary, the format of data is [word][score]. The value range of scores is (−7, 7). In order to facilitate the calculation, scores in the dictionary are normalized to the region of [−1, 1].
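A minimal sketch of the dictionary loading and normalization step follows. The file name and the whitespace-separated "word score" line format are assumptions for illustration; the paper only states the [word][score] layout and the (−7, 7) score range.

```python
# Minimal sketch: load the Boson sentiment dictionary and normalize its scores
# from (-7, 7) to [-1, 1]. Path and line format are assumptions.
def load_dictionary(path: str = "BosonNLP_sentiment_score.txt") -> dict:
    dic = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, raw = line.rsplit(None, 1)   # split off the trailing score
            dic[word] = float(raw) / 7.0       # map (-7, 7) onto [-1, 1]
    return dic

def rtr(word: str, dic: dict):
    """Definition 2 / Equation (1): retrieval returns None ('empty') for unknown words."""
    return dic.get(word)
```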
We collect the short text dataset from the Chinese platform Weibo. We imitate the dataset from github.com/SophonPlus/ChineseNlpCorpus to annotate our short text dataset, and extract the data that satisfy the task of this paper. Long texts over 50 words are deleted from the merged dataset, leaving about 903 texts with emojis. Each text contains some combination of Chinese [C], English [E], and emojis [e], and the format of the texts is {[C, E, e] | [C, E] | [C, e] | [E, e]}, as shown in Table 3.

4.2. Experiment and Result

In this section, we perform two groups of experiments, covering the selection of the t-value of EORW and the Emotional Computation. Regarding the t-value selection experiment, multiple iterations are conducted to determine the optimal value of t. For the Emotional Computation experiment, the emotional scores of the sentences in the dataset are computed, and these scores are used to assess whether the predicted tendencies align with the tendencies associated with the labels.
In the first experiment, two samples of word-level emotional computation and one sample of emoji-level emotional computation are shown as follows.
As shown in Table 4, angry and fire are taken as examples (six decimal places are retained and t = 5). When uw is angry, NW-Set(angry) = {irate, enraged, indignant, incensed, annoyed}. Then, NW-Set(angry) is fed into Equation (5). However, MESi(angry) = {Ø} on account of the fact that the emotional scores of wi in NW-Set(angry) are consistent. As a result, Suw(angry) can be calculated by the first case of Equation (7). Conversely, the emotional tendencies of the neighbors of fire as uw are inconsistent. NW-Set(fire) = {firing, alarm, destroyed, fumes, fired}. It is noteworthy that the emotional tendency of “fumes” in the NW-Set is different from the others. Therefore, MES1(fire) = {firing, alarm, destroyed, fired}, and MES2(fire) = {fumes}. Based on Equation (6), RS(fire) = MES1(fire). At last, Suw(fire) can be computed by the second case of Equation (7).
In Table 5, neighbor means t neighbor words of emofMeaning “crying” from the emoji 😭, Emotion represents the emotional score of neighbors of emofMeaning “crying”, and S stands for the final score of emofMeaning “crying” that is equal to the emoji 😭.
t-value of EORW. To accomplish the goal of computing the emotional score of uw and verify the feasibility of EORW, this paper covers up the emotional score of w in Dic to compare the emotional score calculated by EORW with the score stored in Dic.
To facilitate the test, the threshold is set to 30%. When the relative error T is lower than 30%, the result is regarded as correctly computed. T is denoted as Equation (22).
$T = \dfrac{\left| label - uw[score] \right|}{label} \times 100\%$ (22)
In this paper, the accuracy of a group is determined by the ratio of correctly computed words to the total number of words. In this condition, the symbol nccw is used to denote the number of words computed correctly, while N represents the total number of words in a given experimental group. The accuracy Acc is denoted as shown in Equation (23).
$n_{ccw} = len\left(\{w_{ccw} \mid T < 30\%\}\right), \qquad Acc = \dfrac{n_{ccw}}{N}$ (23)
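Equations (22) and (23) can be read as the short check below; the use of the absolute label value in the denominator is an assumption added so that negative labels are handled symmetrically.

```python
# Minimal sketch of Equations (22)-(23): a covered word is counted as correct
# when its relative error against the dictionary label is below 30%.
def accuracy(predicted: dict, labels: dict, threshold: float = 0.30) -> float:
    """predicted / labels map each covered word to its EORW score and true score."""
    correct = 0
    for w, label in labels.items():
        t = abs(label - predicted[w]) / abs(label)   # Equation (22)
        if t < threshold:
            correct += 1
    return correct / len(labels)                     # Equation (23)
```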
In addition, this paper chooses a moderate t-value. Five hundred words uw[cover] are generated from the dictionary Dic. Ten groups of experiments are conducted with different t values, where t belongs to the set [3, 5, 7, 9, 11, 13, 15].
As shown in Figure 8, when 300 words are randomly generated, the overall accuracy is between 74.3% and 85%. When t is 9, the upper bound of accuracy is about 82.7% and the lower bound of accuracy is about 79.3%. When the t-value is equal to 9, accuracy becomes stable. When 500 words are randomly generated, the overall accuracy is between 74.2% and 84%. When t is 9, the upper bound of accuracy is about 83.2% and the lower bound of accuracy is about 78%. From what has been presented, the upper and lower bounds of accuracy are closer than other t when t is 9, which means that EORW is the most stable in this condition.
Proportion of emoji usage. This paper presents a statistical analysis of the proportion of emoji usage in the collected dataset, as depicted in Figure 9. In the collected data, the ‘loudly crying face’ emoji has the highest usage frequency, accounting for 21% of the total. In contrast, it is obvious from Figure 9 that the ‘kiss’ emoji has a very low usage frequency, representing only 1.13% of the total. Additionally, other infrequently used emojis have been grouped into the “others” category, which accounts for approximately 15.18% of the total emoji usage.
😭 in the bilingual sentence S is transformed to emofMeaning[crying]. The processing results of emotional scores of [crying] calculated are depicted in Table 5. As shown in Table 6, the emoji 😫 in the bilingual sentence S can be similarly computed.
Result of emotional computation. After eliminating disordered data, the remaining dataset consists of 903 texts where 601 data labels are positive and 302 data labels are negative. In addition, some marks with bad influence on the experiment are stripped from the dataset. Then, the dataset is fed into the computation model. As a result, the accuracy of the emotional computation reaches about 98.67%.
In the bilingual sentence S, Kc and Ke respectively contain three keywords and one keyword. Each phrase has nine neighbors, each with its own score. A score above 0 indicates a positive tendency, while a score below 0 signifies a negative tendency. Through EORW, the dominant emotional set, also named RS, can be computed. As a result, the emotional scores of the keywords can be calculated, as shown in Table 7. As shown in Figure 10, the bilingual sentence S contains four keywords (Chinese phrases: 挑选 (translation: select); 实用的 (translation: practical); 礼物 (translation: gifts). English word: stupid).
As depicted in Figure 11, the English sentence S has four keywords (hope, become, good, friends) extracted from Ke. According to EORW, RS can be computed. As a result, the emotional scores of the keywords can be calculated as shown in Table 8.
According to Equation (18), the emotional scores of the Chinese and English corpora can be respectively computed. Through Equation (16), the fluctuation can be checked. Finally, in light of Equation (20), the result of the emotional fusion is computed. The emotional processing of the bilingual sentence and the English sentence is noted in Table 9.
As demonstrated in Table 10, CNEC outperforms the average strategy by 1.11%. In addition, CNEC improves by 21.04% compared with the maximum-value method. In terms of emotional fusion, CNEC has a clear competitive advantage. Furthermore, the experimental outcomes are compared with several prominent deep learning models that are frequently employed in the field, as shown in Table 11. The BERT model is built upon the BERT-base model, with a learning rate of 2 × 10^−5, a batch size of 32, and a total of 5 training epochs. RoBERTa-base is adopted in this paper, with a learning rate of 2 × 10^−5, a batch size of 16, and a total of 5 training epochs. In BERTprompting, BERT-base is also the backbone of the model, with a learning rate of 2 × 10^−5, a batch size of 8, and a total of 5 training epochs. Following prompt learning, the BERTprompting template is designed as “How {} it was.”. For all methods, a cross-entropy loss is used as the loss function during training for sentiment analysis, as shown in Equation (24).
$Loss = -\dfrac{1}{N}\sum_{i}\left[ y_i \log(p_i) + (1 - y_i)\log(1 - p_i) \right]$ (24)
where N denotes the number of samples, yi represents the true label of the i-th sample, and pi represents its probability.
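As a minimal illustration, the binary cross-entropy in Equation (24) matches the standard PyTorch loss used when the baselines are fine-tuned as binary sentiment classifiers; the example probabilities and labels below are hypothetical.

```python
# Equation (24) as the standard binary cross-entropy loss in PyTorch.
import torch

probs = torch.tensor([0.9, 0.2, 0.7])    # hypothetical predicted probabilities p_i
labels = torch.tensor([1.0, 0.0, 1.0])   # true labels y_i
loss = torch.nn.functional.binary_cross_entropy(probs, labels)
print(loss.item())   # averages -[y*log(p) + (1-y)*log(1-p)] over the N samples
```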

5. Discussion

This paper pays attention to the computation of emotion scores for bilingual short texts with emoji symbols. Traditional methods and some deep learning models cannot tackle the problem of computing scores for bilingual data incorporating emoji symbols; moreover, mainstream deep learning approaches just compute the score of a single corpus based on their own rules. This study employs three deep learning models to conduct a comparative analysis with the suggested approach. The BERT model’s pre-training data primarily consists of a vast English corpus, which poses limitations when dealing with texts that contain multiple languages and emojis. Nonetheless, by incorporating a template into BERT, its expressive capability is enhanced by approximately 5%. Conversely, the RoBERTa model, which excludes the NSP task and incorporates a larger training corpus, exhibits superior performance when handling extremely short sentences. However, these approaches excel primarily in classification tasks and do not effectively utilize existing knowledge that contains emotional scores for emotion scoring. By contrast, the CNEC framework relies on the professional knowledge of the emotional dictionary and utilizes the EORW method to remedy the deficiencies of the emotional dictionary on bilingual data. Furthermore, CNEC drives EORW to compute the emotional scores of emoji symbols, and it can illustrate emotional fluctuation. In addition, based on CNEF, emotional fusion is utilized to compute the emotional score of bilingual short texts with emojis. The reason for utilizing centrifugal motion in this paper to illustrate emotional fluctuation is the consistency of emotions within each part of a short text: if the emotion of a part fluctuates, it is akin to an object in circular motion suddenly lacking the centripetal force necessary to maintain its trajectory, and thus moving centrifugally. The reason why the accuracy of the experiment can reach 98.67% is that the emotional tendencies of short texts are more explicit than those of long texts. Additionally, this dataset from Weibo does not contain implicit emotion, such as emojis with a positive tendency used to express a negative tendency; as a result, the emotional scores of emojis or phrases are stationary for every sentence in which they appear. Furthermore, the quantity of datasets with a bilingual corpus and emojis is limited, and most data involve a single language with emojis. Besides, as emotion dictionaries are constructed based on specific knowledge categories, selecting different emotion dictionaries may result in situations where a phrase has different emotional scores, which in turn imposes specific constraints on the emotional score range of the phrase.
In the future, how to tackle different implicit meanings of phrases and emojis is to be considered as the most significant point. In addition, how to classify emotions at a fine-grained level is also a matter of great importance in future work. It is also worth studying the issue of harmonizing the grading differences among different dictionaries for various fields as much as possible.

Author Contributions

Conceptualization, T.Y. and Z.L.; methodology, T.Y. and Z.L.; software, T.Y. and Z.L.; validation, Z.L. and Y.L.; formal analysis, T.Y.; investigation, J.Z.; resources, Z.L.; data curation, T.Y. and Z.L.; writing—original draft preparation, Z.L.; writing—review and editing, T.Y.; supervision, T.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Sichuan Science and Technology Program under Grant No. 2022YFG0322, China Scholarship Council Program (Nos. 202001010001 and 202101010003) and the Innovation Team Funds of China West Normal University (No. KCXTD2022-3).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Nasukawa, T.; Yi, J. Sentiment analysis: Capturing favorability using natural language processing. In Proceedings of the 2nd International Conference on Knowledge Capture, Sanibel Island, FL, USA, 23–25 October 2003; pp. 70–77. [Google Scholar]
  2. Hu, M.; Liu, B. Mining and summarizing customer reviews. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, WA, USA, 22–25 August 2004; pp. 168–177. [Google Scholar]
  3. Pang, B.; Lee, L. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. arXiv 2005, arXiv:0506075. [Google Scholar]
  4. Thet, T.T.; Na, J.C.; Khoo, C.S. Aspect-based sentiment analysis of movie reviews on discussion boards. J. Inf. Sci. 2010, 36, 823–848. [Google Scholar] [CrossRef]
  5. Alguliyev, R.M.; Aliguliyev, R.M.; Niftaliyeva, G.Y. Extracting social networks from e-government by sentiment analysis of users’ comments. Electron. Gov. Int. J. 2019, 15, 91–106. [Google Scholar] [CrossRef]
  6. Li, M.; Ch’ng, E.; Chong, A.Y.L.; See, S. Multi-class Twitter sentiment classification with emojis. Ind. Manag. Data Syst. 2018, 118, 582. [Google Scholar] [CrossRef]
  7. Jain, D.K.; Kumar, A.; Sangwan, S.R. TANA: The amalgam neural architecture for sarcasm detection in indian indigenous language combining LSTM and SVM with word-emoji embeddings. Pattern Recognit. Lett. 2022, 160, 11–18. [Google Scholar] [CrossRef]
  8. Yang, T.; Liu, Z.; Chen, Q.; Ma, X.; Deng, H. Emotional Computation of Unfamiliar Word based on Emotional Orientation of Related Words. In Proceedings of the 2nd International Conference on Artificial Intelligence, Big Data and Algorithms, Nanjing, China, 17–19 June 2022. [Google Scholar]
  9. Le, B.; Nguyen, H. Twitter sentiment analysis using machine learning techniques. In Proceedings of the 3rd International Conference on Computer Science, Applied Mathematics and Applications-ICCSAMA, Hanoi, Vietnam, 19–20 December 2015; pp. 279–289. [Google Scholar]
  10. Mowlaei, M.E.; Abadeh, M.S.; Keshavarz, H. Aspect-based sentiment analysis using adaptive aspect-based lexicons. Expert Syst. Appl. 2020, 148, 113234. [Google Scholar] [CrossRef]
  11. Keshavarz, H.; Abadeh, M.S. ALGA: Adaptive lexicon learning using genetic algorithm for sentiment analysis of microblogs. Knowl.-Based Syst. 2017, 122, 1–16. [Google Scholar] [CrossRef]
  12. Alqaryouti, O.; Siyam, N.; Abdel Monem, A.; Shaalan, K. Aspect-based sentiment analysis using smart government review data. Appl. Comput. Inform. 2020. [Google Scholar] [CrossRef]
  13. Revathy, G.; Alghamdi, S.A.; Alahmari, S.M.; Yonbawi, S.R.; Kumar, A.; Haq, M.A. Sentiment analysis using machine learning: Progress in the machine intelligence for data science. Sustain. Energy Technol. Assess. 2022, 53, 102557. [Google Scholar] [CrossRef]
  14. Mai, L.; Le, B. Joint sentence and aspect-level sentiment analysis of product comments. Ann. Oper. Res. 2021, 300, 493–513. [Google Scholar] [CrossRef]
  15. Da’u, A.; Salim, N.; Rabiu, I.; Osman, A. Weighted aspect-based opinion mining using deep learning for recommender system. Expert Syst. Appl. 2020, 140, 112871. [Google Scholar]
  16. Pathak, A.R.; Pandey, M.; Rautaray, S. Topic-level sentiment analysis of social media data using deep learning. Appl. Soft Comput. 2021, 108, 107440. [Google Scholar] [CrossRef]
  17. Chen, T.; Xu, R.; He, Y.; Wang, X. Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN. Expert Syst. Appl. 2017, 72, 221–230. [Google Scholar] [CrossRef] [Green Version]
  18. Liang, B.; Su, H.; Gui, L.; Cambria, E.; Xu, R. Aspect-based sentiment analysis via affective knowledge enhanced graph convolutional networks. Knowl.-Based Syst. 2022, 235, 107643. [Google Scholar] [CrossRef]
  19. Huang, B.; Guo, R.; Zhu, Y.; Fang, Z.; Zeng, G.; Liu, J.; Shi, Z. Aspect-level sentiment analysis with aspect-specific context position information. Knowl.-Based Syst. 2022, 243, 108473. [Google Scholar] [CrossRef]
  20. Li, X.; Zhang, J.; Du, Y.; Zhu, J.; Fan, Y.; Chen, X. A novel deep learning-based sentiment analysis method enhanced with Emojis in microblog social networks. Enterp. Inf. Syst. 2023, 17, 2037160. [Google Scholar] [CrossRef]
  21. Li, D.; Rzepka, R.; Ptaszynski, M.; Araki, K. Emoji-Aware Attention-based Bi-directional GRU Network Model for Chinese Sentiment Analysis. In Proceedings of the LaCATODA/BtG@ IJCAI, Macao, China, 24–29 August 2019; pp. 11–18. [Google Scholar]
  22. Huang, Y.; Liu, Q.; Peng, H.; Wang, J.; Yang, Q.; Orellana-Martín, D. Sentiment classification using bidirectional LSTM-SNP model and attention mechanism. Expert Syst. Appl. 2023, 221, 119730. [Google Scholar] [CrossRef]
  23. Gu, T.; Zhao, H.; He, Z.; Li, M.; Ying, D. Integrating external knowledge into aspect-based sentiment analysis using graph neural network. Knowl.-Based Syst. 2023, 259, 110025. [Google Scholar] [CrossRef]
  24. Zhao, M.; Yang, J.; Zhang, J.; Wang, S. Aggregated graph convolutional networks for aspect-based sentiment classification. Inf. Sci. 2022, 600, 73–93. [Google Scholar] [CrossRef]
  25. Zhou, T.; Law, K.M. Semantic Relatedness Enhanced Graph Network for aspect category sentiment analysis. Expert Syst. Appl. 2022, 195, 116560. [Google Scholar] [CrossRef]
  26. Xu, L.; Pang, X.; Wu, J.; Cai, M.; Peng, J. Learn from structural scope: Improving aspect-level sentiment analysis with hybrid graph convolutional networks. Neurocomputing 2023, 518, 373–383. [Google Scholar] [CrossRef]
  27. Xu, Y.; Yao, E.; Liu, C.; Liu, Q.; Xu, M. A novel ensemble model with two-stage learning for joint dialog act recognition and sentiment classification. Pattern Recognit. Lett. 2023, 165, 77–83. [Google Scholar] [CrossRef]
  28. Zhang, Y.; Liu, Y.; Li, Q.; Tiwari, P.; Wang, B.; Li, Y.; Pandey, H.M.; Zhang, P.; Song, D. CFN: A Complex-Valued Fuzzy Network for Sarcasm Detection in Conversations. IEEE Trans. Fuzzy Syst. 2021, 29, 3696–3710. [Google Scholar] [CrossRef]
  29. Qiu, J.; Ji, W.; Lam, H.-K. A New Design of Fuzzy Affine Model-Based Output Feedback Control for Discrete-Time Nonlinear Systems. IEEE Trans. Fuzzy Syst. 2023, 31, 1434–1444. [Google Scholar] [CrossRef]
  30. Khalifa, T.R.; El-Nagar, A.M.; El-Brawany, M.A.; El-Araby, E.A.G.; El-Bardini, M. A Novel Hammerstein Model for Nonlinear Networked Systems Based on an Interval Type-2 Fuzzy Takagi–Sugeno–Kang System. IEEE Trans. Fuzzy Syst. 2020, 29, 275–285. [Google Scholar] [CrossRef]
  31. Turney, P.D. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. arXiv 2002, arXiv:0212032. [Google Scholar]
  32. Wang, H.; Liu, B.; Li, C.; Yang, Y.; Li, T. Learning with noisy labels for sentence-level sentiment classification. arXiv 2019, arXiv:1909.00124. [Google Scholar]
Figure 1. The problems addressed in this article. The English word “anti-hero” is not stored in the sentiment dictionary used by the experiment. Subsequently, the emotional score of “Anti-hero” can’t be loaded by the sentiment dictionary. In other words, the score is empty. On the contrary, the English word “hero” is contained by the dictionary this paper uses. Afterwards, [score(hero) = 0.780822] is retrieved by the dictionary where scores are normalized in advance to between −1 and 1. In the mixed text “为你挑选了实用的礼物, 而你stupid 😫” (Translated version: I select practical gifts for you, but you’re stupid 😫.) with the negative tendency, keywords are “挑选 (select, v., score: 0.290705)”, “实用的 (practical, adj., score: 0.267186)”, “礼物 (gifts, n., score: 0.346128)”, “stupid (adj., score: −0.472602)” and “😫 (emoji, score: −0.354642)”. If the emotional scores are directly summed, the result would be 0.076775, which does not match the label of this sentence. Besides, the emotional fluctuation on the position of “stupid” cannot be directly illustrated.
Figure 2. Embeddings of extraction.
Figure 3. Emotional computation of unknown words. A Word and its Score are extracted as the experiment data uw[cover] and score. In light of Equation (2), the similarity value among the embeddings can be calculated. Through the EORW method, uw(Score) is computed.
Figure 4. The pipeline of sentiment computation of the emoji named tear.
Figure 5. Fluctuation position of three elements.
Figure 6. Centrifugal Navigation-Based Emotion Computation framework for emotional fluctuation.
Figure 7. Emotional fluctuation of English text with an emoji.
Figure 8. The accuracies of emotional computation of 300 and 500 randomly generated words from the emotional dictionary. Different t-values are adopted. The various box plots with distinct colors illustrate different t-values aligned with their respective positions.
Figure 9. The proportion of emojis used in the dataset collected in this paper.
Figure 10. Emotional scores of neighbors of wi in Kc and neighbors of wk in Ke. This figure depicts that the bilingual sentence S has four keywords extracted respectively from Kc and Ke. In the Chinese element, The neighbors in NW-Set of select (Original Chinese: 挑选): {挑 (choose), 选购 (optional), 看中 (fancy), 选定 (pick out), 甄选 (pick), 购买 (purchase), 选择 (option), 选取 (select and extract)}; The neighbors in NW-Set of Practical (Original Chinese: 实用的): {实惠 (affordable), 耐用 (durable), 简单 (easy), 方便 (convenient), 有用 (useful), 合算 (be a bargain), 划算 (favorable), 省钱 (economical), 便宜 (cheap)}; The neighbors in NW-Set of Gifts (Original Chinese: 礼物): {礼品 (souvenirs), 送给 (give sb), 见面礼 (a gift such as is usually given to sb. on first meeting him), 生日 (birthday), 贺礼 (congratulatory gift), 贺卡 (congratulation card), 送礼 (give a present), 心意 (compliments or gifts), 过生日 (celebrate a birthday)}.
Figure 11. Emotional scores of neighbors of wi in Ke. This figure depicts that the English sentence S has four keywords extracted from Ke.
Table 1. Instances of emoji Linking Map LinkMap.

Emoji | Textual Meaning | English Shape in Textual Data
😀 | [hah] | :grinning_face:
❤️ | [love you] | :red_heart:
😵 | [dizzy] | :dizzy_face:
😠 | [angry] | :angry_face:
👏 | [clap] | :clapping_hands:
😱 | [shocked] | :face_screaming_in_fear:
👍 | [like] | :thumbs_up:
😊 | [smile] | :smiling_face_with_smiling_eyes:
Table 2. Corresponding parameters between emotional fluctuation and centrifugal motion.

Centrifugal Motion | Sentiment Computation
Centre of circle: O | Other elements: M
An object: m | Fluctuation element: mf
Radius of the circle: R | Sum of the lengths of M and mf: lengthx + lengthy
Trigger of centrifugal motion: θ > 90 degrees | Trigger of emotional fluctuation: condition of Equation (16)
Table 3. Instances of the dataset.

Label | Text
Negative | 为你挑选了实用的礼物, 而你stupid 😫
Negative | I hope we will become good friends 😭
Positive | #积极向上#日子还要继续, 开心点! 😀
Positive | @养熊猫的李桑Love you! 😊
Table 4. Results of emotional computation about words.

Word | Neighbor | Emotion (Ei) and Score | Score
angry | irate | E1: −0.726027 | −0.479452
angry | enraged | E2: −0.397260 |
angry | indignant | E3: −0.424658 |
angry | incensed | E4: −0.479452 |
angry | annoyed | E5: −0.369863 |
fire | firing | E1: −0.315068 | −0.452055
fire | alarm | E2: −0.315068 |
fire | destroyed | E3: −0.534247 |
fire | fumes | E4: 0.041096 (discarded) |
fire | fired | E5: −0.643836 |
Table 5. Emotional computation of emoji ‘loudly crying face’ (retaining six decimal places).

Emoji | emofMeaning | Neighbor | Emotion | S
😭 | crying | sobbing | −0.369863 | −0.388128
 | | cried | −0.369863 |
 | | screaming | −0.369863 |
 | | weeping | −0.452055 |
 | | cries | −0.39726 |
 | | cry | −0.506849 |
 | | moaning | −0.041096 |
 | | screamed | −0.287671 |
 | | sob | −0.69863 |
Table 6. The processing of emojis in the bilingual sentence S (retaining six decimal places).

Emoji | emofMeaning | Neighbor | Emotion | S
😫 | tired | fatigued | −0.315068 | −0.354642
 | | weary | −0.232877 |
 | | bored | −0.232877 |
 | | frustrated | −0.589041 |
 | | irritated | −0.479452 |
 | | annoyed | −0.369863 |
 | | impatient | −0.260274 |
 | | exhausted | −0.342466 |
 | | jaded | −0.369863 |
Table 7. Emotional scores of keywords in the bilingual sentence S (retaining six decimal places).

Keyword | Score
select (Original Chinese: 挑选) | 0.290705
Practical (Original Chinese: 实用的) | 0.267186
Gifts (Original Chinese: 礼物) | 0.346129
stupid | −0.472603
Table 8. Emotional scores of keywords in the English sentence S (retaining six decimal places).

Keyword | Score
hope | 0.567732
become | 0.488584
good | 0.647750
friends | 0.710807
Table 9. Results of emotional scores of different elements (retaining six decimal places).

Sentence Type | Label | SM1 | SM2 | SM3 | Fl_P | FS
Bilingual | negative | 0.301340 | −0.472603 | −0.354642 | M2 | −0.232910
English | negative | 0.603718 | −0.388129 | Empty | M2 | −0.270548
Table 10. Comparable accuracy of different emotional fusion methods.

Average | Maximum-Value | CNEC
0.9756 | 0.7763 | 0.9867
Table 11. Comparison of results on the dataset with deep learning models.

BERT | RoBERTa | BERTprompting (Template: How {} it was) | CNEC
0.8106 | 0.9700 | 0.8671 | 0.9867