Sentiment Analysis Based on Heterogeneous Multi-Relation Signed Network

Zhao, Qin; Yu, Chenglei; Huang, Jingyi; Lian, Jie; An, Dongdong

doi:10.3390/math12020331


¹ Shanghai Engineering Research Center of Intelligent Education and Big Data, Shanghai Normal University, Shanghai 201418, China

² Key Laboratory of Embedded Systems and Service Computing of Ministry of Education, Tongji University, Shanghai 201804, China

* Author to whom correspondence should be addressed.

Mathematics 2024, 12(2), 331; https://doi.org/10.3390/math12020331

Submission received: 27 December 2023 / Revised: 13 January 2024 / Accepted: 15 January 2024 / Published: 19 January 2024

(This article belongs to the Section Mathematics and Computer Science)


Abstract

Existing sentiment prediction methods often only classify users’ emotions into a few categories and cannot predict the variation of emotions under different topics. Meanwhile, network embedding methods that consider structural information often assume that links represent positive relationships, ignoring the possibility of negative relationships. To address these challenges, we present an innovative approach in sentiment analysis, focusing on the construction of a denser heterogeneous signed information network from sparse heterogeneous data. We explore the extraction of latent relationships between similar node types, integrating emotional reversal and meta-path similarity for relationship prediction. Our approach uniquely handles user-entity and topic-entity relationships, offering a tailored methodology for diverse entity types within heterogeneous networks. We contribute to a deeper understanding of emotional expressions and interactions in social networks, enhancing sentiment analysis techniques. Experimental results on four publicly available datasets demonstrate the superiority of our proposed model over state-of-the-art approaches.

Keywords:

sentiment analysis; relationship prediction; heterogeneous signed networks; emotional prediction

MSC:

68R05

1. Introduction

The emergence of social media platforms such as Facebook, Twitter, and Weibo has provided novel avenues for emotional exchange, garnering significant attention in sentiment analysis research [1]. This field boasts a multitude of applications, ranging from gauging public opinion during emergencies to improving user recommendation systems and even predicting stock market trends based on social media sentiment propagation. In the realm of sentiment analysis on social media, textual content undergoes processing via natural language processing techniques [2] to ascertain users’ emotional states. A primary challenge lies in identifying latent sentiments embedded within texts [3,4], often resulting in a preponderance of neutral sentiment classifications. Earlier research highlights the scarcity of explicit emotional language in social media texts, underlining the importance of uncovering latent emotional expressions [5,6]. Furthermore, conventional text-based sentiment analysis approaches fall short in detecting non-textual emotional expressions [7].

Sociological theories suggest that emotions are influenced by both personal factors and the environment. Social media platforms enable the formation and maintenance of friendships, which also involve emotional transmission. As human communication progresses, emotions change and spread among users. In recent years, deep learning models [8] have attracted the attention of many scholars who aim to analyze user emotions. Some researchers have tried to incorporate users’ latent relationships, such as attribute features, interactive behaviors [9,10,11], and social connections [12,13,14,15], as additional information to extract users’ hidden emotions. These models use deep learning to represent user emotions inspired by collaborative filtering methods and achieve the prediction of sentiment attitudes on various topics.

Sentiment analysis is the task of identifying and extracting emotions from text. The concept of sentiment analysis was first introduced by Das et al. in 2001, who defined emotions as positive and negative sentiments [16]. However, pure text-based sentiment analysis methods face many challenges in online social networks, where the sentiment polarity is often ambiguous and influenced by various factors such as contextual nuances, linguistic variability, and mixed sentiments [17].

Previous research has limitations, as many studies use extra information to identify users’ basic emotions but fail to capture the nuances across different topics. Topics, being the root of user emotions, are key. Not only do topics’ inherent attributes matter, but also their interrelations significantly influence emotions, particularly in political contexts. For instance, a Democrat supporter often shows negative sentiments towards the Republican party on social media. Hence, opposing topics typically evoke contrasting reactions from the same individual.

Meanwhile, network embedding methods that consider structural information often assume that links represent positive relationships, ignoring the possibility of negative relationships. For example, some methods treat all users in the follower list as friends without considering the existence of adversaries. However, research shows [18] that negative links have a much larger impact on network topology and value than positive links. Therefore, it is essential to analyze and quantify the polarity of relationships when exploring latent relationships. Some methods use interaction behaviors to infer relationships, assigning different weights to different behaviors, but they neglect the polarity embedded in some behaviors [19]. For instance, on Weibo, users can repost messages to express approval or disapproval. Using a single interaction to describe these two opposite relationships is not reasonable.

This study, addressing the challenges mentioned above, proposes to formulate the problem of predicting users’ sentiment towards topics as a polarity prediction problem in a heterogeneous signed network. Traditional sentiment analysis techniques often struggle with capturing the dynamics of negative relationships in social networks [20]. This method, however, specifically addresses these challenges by constructing a denser heterogeneous signed information network from sparse data. It leverages emotional reversal and meta-path similarity to predict relationships more accurately. The input is a heterogeneous signed network, and the proposed methods mine different types of relationships between users and topics: (1) User-space friendship relationships, which capture the behavioral patterns of users as indicators of their relationships; (2) Topic relevance, which measures the similarity of topics based on users’ historical preferences. We introduce a sentiment prediction model that considers the joint effect of heterogeneity and polarity, using node type-aware attention layers and semantic path-aware attention layers. It employs two attention mechanisms to fuse the context information hierarchically, obtain the sentiment embeddings of users and topics, and then predict their sentiment towards topics. The remainder of this paper is organized as follows: Section 2 reviews related work. Section 3 elaborates on the proposed model. The experimental setup and results are presented in Section 4. Finally, Section 5 concludes this paper and discusses future work.

2. Related Work

Many studies have incorporated additional information, such as user attributes and user interaction data (e.g., following, retweeting, and interactions), to improve the accuracy of user sentiment analysis [21]. For example, Tan et al. use user behavioral relationships, such as following and mentions, to analyze user sentiments by minimizing label differences among neighbors [22]. Kuo et al. construct a social opinion graph for sentiment analysis of Weibo user groups, combining social interaction information and textual opinions to infer sentiments towards trending topics [23]. Ren et al. model a user’s sentiment attitude towards a topic as a collaborative filtering task, using the follower information for matrix factorization to obtain sentiment analysis results [24]. Inspired by collaborative filtering, Kim et al. measure user sentiments towards topics by calculating user similarity, but their method only considers individual user attributes, ignoring the effect of social relationships among users [25].

Some studies consider sociological theories to analyze user emotions in social networks. For example, Smith et al. perform sentiment clustering based on emotion consistency to obtain user-level sentiments [26]. However, they ignore the impact of user relationships on user sentiment, leading to suboptimal results. Speriosu et al. combine the maximum entropy model with the follower network to obtain training labels and then apply label propagation for sentiment analysis [18]. Eliacik et al. consider user influence and use the PageRank algorithm for sentiment propagation to obtain the final user sentiment [27]. Other studies also consider the social context comprehensively. Nozza et al. model Weibo as a heterogeneous network and infer both Weibo and user sentiment polarity using a method based on user latent representations [28]. Cheng et al. divide user relationships into approval and disapproval and use unsupervised methods for user-level sentiment analysis [29].

On the other hand, some works have explored the use of Agent-Based Modeling (ABM) as an alternative approach to relationship prediction, focusing on the study of human behavior and emotions, such as panic and the dissemination of opinions. For instance, a multi-agent system grounded in extensive linguistic analysis has been developed to address the issues faced by existing models, integrating syntactic, semantic, and subjective analyses to effectively manage the ambiguities and complexities of natural evaluative language [1]. Additionally, an effective modified fuzzy procedure for the dynamic clustering of crowds has been proposed, aiming to determine optimal control parameters for agent-rescuers, such as agent speed, waiting time, and the distribution of agents among crowd clusters [30].

In summary, existing social network sentiment analysis methods have difficulties in extracting the heterogeneous and polarized relationships in social networks. Moreover, they do not pay enough attention to the interactive relationships among social network nodes, which may contain hidden information.

3. Materials and Methods

The current research landscape has witnessed an increasing adoption of deep learning methodologies for node representation and subsequent relationship prediction. However, a predominant focus of these studies has been on social scenarios that primarily involve user interactions. There is a noticeable dearth in the exploration of relationships with other types of entities. Frequently, these studies employ a one-size-fits-all approach to relationship prediction, overlooking the diverse nature of relationships inherent to different entities. For instance, in user-centric networks, relationship predictions often revolve around identifying friendships or antagonisms among users, as seen in user recommendation systems. Conversely, in biological networks like protein–protein interaction networks, the goal is to predict interactions that facilitate or enhance biological activity post-protein binding. This divergence in objectives underscores the necessity for developing relationship prediction methodologies that are bespoke to the unique characteristics of each entity space within these networks.

In addressing the nuances of heterogeneous networks, this section delves into two specific types of entity relationships: user-entity and topic-entity relationships. We propose and elucidate two distinct methodologies for predicting relationships within these entity categories, as shown in Figure 1. These methodologies are designed to acknowledge and leverage the unique properties and interaction dynamics of each entity type, providing a more tailored and accurate approach to relationship prediction in heterogeneous network environments.

3.1. Method

Through the above analysis, the main objective of this section is to extract valuable information from the original sparse heterogeneous information network. The goal is to capture potential relationships between nodes of the same type and ultimately construct a denser heterogeneous signed information network. In this section, based on the known input, we design the original heterogeneous information network $G_{old} = (V, E)$, which involves two types of nodes and three types of edges. The node types include user type $u$ and topic type $v$, denoted as $\Gamma_{V} = \{u, v\}$. The edge types consist of three categories $\Gamma_{E} = \{p, s^{+}, s^{-}\}$, where one type is the unsigned social relationship $p$ between nodes of the user type, and the other two types are the positive sentiment relationship $s^{+}$ and the negative sentiment relationship $s^{-}$ between nodes of the user type and nodes of the topic type, representing user support or opposition to the given topic.

For user relationship prediction, we integrate the emotional reversal feature $S$, the social relationship feature $J$, the individual characteristic feature $O_{u}$, and the individual activity feature $A_{u}$, and perform a fused calculation to obtain the friendship or enmity relationship $P$ between users. This is represented by the function $F_{user}$, formulated as follows:

$$P(u_{i}, u_{j}) = F_{user}(S, J, O_{u}, A_{u}). \tag{1}$$

For the prediction of topic relationships, a fused calculation is performed by integrating the topic characteristics $O_{v}$ and the meta-path similarity $H$. This process determines the correlation relationship $Q$ between topics, denoted by the function $F_{topic}$:

$$Q(v_{i}, v_{j}) = F_{topic}(O_{v}, H). \tag{2}$$

This section ultimately aims to obtain the heterogeneous signed information network $G_{new} = (V, E')$, which involves two types of nodes and six types of edges. The node types are unchanged: $\Gamma_{V}' = \Gamma_{V} = \{u, v\}$. The edge types consist of six categories $\Gamma_{E}' = \{p^{+}, p^{-}, s^{+}, s^{-}, q^{+}, q^{-}\}$, where $P = \{p^{+}, p^{-}\}$ represents the friendship/enmity relationships between nodes of the user type, $S = \{s^{+}, s^{-}\}$ represents the support relationships between nodes of the user type and nodes of the topic type, and $Q = \{q^{+}, q^{-}\}$ represents the correlation relationships (coupling/competitive) between nodes of the topic type.
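To make the two network schemas concrete, the following minimal Python sketch contrasts the edge types allowed in the original network with those in the target signed network. The class `HeteroNetwork`, its fields, and the string labels are illustrative assumptions and are not part of the paper.

```python
from dataclasses import dataclass, field

# Edge-type labels mirror the notation in the text; the string encoding is an assumption.
G_OLD_EDGE_TYPES = frozenset({"p", "s+", "s-"})
G_NEW_EDGE_TYPES = frozenset({"p+", "p-", "s+", "s-", "q+", "q-"})


@dataclass
class HeteroNetwork:
    """Hypothetical container for a network with user nodes (u) and topic nodes (v)."""
    allowed_edge_types: frozenset
    users: set = field(default_factory=set)
    topics: set = field(default_factory=set)
    edges: dict = field(default_factory=dict)  # (src, dst) -> edge type

    def add_edge(self, src, dst, edge_type):
        # Reject edge types that the schema of this network does not contain.
        if edge_type not in self.allowed_edge_types:
            raise ValueError(f"edge type {edge_type!r} is not allowed in this network")
        for node in (src, dst):
            if node not in self.users and node not in self.topics:
                raise KeyError(f"unknown node {node!r}")
        self.edges[(src, dst)] = edge_type


# Original sparse network: unsigned social ties plus signed user-topic sentiment.
g_old = HeteroNetwork(G_OLD_EDGE_TYPES, users={"u1", "u2"}, topics={"v1", "v2"})
g_old.add_edge("u1", "u2", "p")   # unsigned social relationship
g_old.add_edge("u1", "v1", "s+")  # u1 supports topic v1

# Target denser network: user-user and topic-topic edges now carry a sign.
g_new = HeteroNetwork(G_NEW_EDGE_TYPES, users={"u1", "u2"}, topics={"v1", "v2"})
g_new.add_edge("u1", "u2", "p-")  # predicted enmity between users
g_new.add_edge("v1", "v2", "q-")  # predicted competition between topics
```

Keeping the allowed edge types as data rather than hard-coding them lets the same container describe both networks.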

3.2. User Relationship Prediction Based on Emotional Reversal

This subsection primarily focuses on the prediction of signed relationships among users. Extracting nodes of the user type and edges representing unsigned social relationships, i.e., $E_{p} : \phi_{E}(\varepsilon) = p$, from the original heterogeneous information network $G_{old} = (V, E)$ forms the raw user relationship network. In this context, unsigned social relationships encompass non-textual social connections, such as following relationships. Using textual information generated by users in interactive scenarios as input, the aim is to unearth potential friendship or enmity relationships among users, constructing a signed network of user relationships. The feature of emotional reversal among users is reflected by the probability matrix of reversal between users, considering the differences in behavior under various interactive scenarios based on textual information generated by users. User social relationship features are determined by calculating the similarity among neighbors in the unsigned social network of users. Meanwhile, individual characteristics and activity levels of users are determined by their attributes. Detailed explanations of the specific computation methods will be provided in subsequent sections.

3.3. Emotional Reversal Theory

The texts generated through retweets and comments differ from regular posts in that they are, to some extent, influenced by the texts they are derived from. Users express their emotions on a given topic on this basis, which shows that other users can influence a user's emotions to a certain degree and, in some cases, even reverse them. The concept of emotional reversal was first introduced by Wang et al., who defined it as the phenomenon where a retweeted text and its retweet (the text generated after retweeting) exhibit different emotional polarities within the cascade tree of retweets. As illustrated in Figure 2, within the same cascade tree, $m_{1}$ serves as the root node and generates four retweets; $m_{2}$, after being retweeted, generates $m_{3}$, and the emotional polarities of the two differ: $m_{2}$ is positive, while $m_{3}$ is negative. Therefore, an emotional reversal occurs between $m_{2}$ and $m_{3}$. By the same criterion, $m_{4}$ and $m_{5}$ do not experience an emotional reversal.

The text, to some extent, represents the author’s emotional expression on a particular topic. Therefore, instances of emotional reversal in the text can be extended to emotional reversal between authors. We further expand this scenario to other behavioral networks, as illustrated in Figure 2. The positive emotional text $m^{u_{1}}$ posted by user $u_{1}$ is interactively associated with multiple users through actions such as retweets and comments. Each interaction generates a text with its own emotional polarity.

In the retweet interaction chain, when user $u_{2}$ retweets the text $m^{u_{1}}$ and generates the text $m^{u_{2}}$ with a negative polarity, an emotional reversal occurs between users $u_{1}$ and $u_{2}$. Additionally, comments may also produce texts with different emotional polarities. For example, user $u_{4}$, who generates a negative comment, also experiences an emotional reversal.

Therefore, we redefine emotional reversal, stating that in the user interaction network, if a text and the text generated by interacting with it have different emotional polarities, it indicates an emotional reversal between the two authors.

3.4. Calculating the Probability of Emotional Reversal for Users

We define the interactive networks between texts as $G_{B} = (M, R)$, where the node type is text and the edge set $R$ represents the associations between texts created through user interactions. The interactive behaviors include retweeting and reviewing, denoted as $B \to \Gamma_{B} = \{\mathrm{retweet}, \mathrm{review}\}$. Each user-type node object corresponds to the set of texts it produces. For example, the text set of user $u_{i}$ is denoted as $M_{u_{i}} = \{m_{1}^{u_{i}}, m_{2}^{u_{i}}, \ldots, m_{L}^{u_{i}}\}$, where each text $m_{i}$ undergoes sentiment classification using the sentiment analysis method from the previous section, resulting in a sentiment polarity $s_{m} \in \{-1, 1\}$, representing negative and positive sentiments, respectively.

Thus, the behavioral network $G_{B}$ is transformed into two sentiment reversal networks, one per behavior, denoted as $G_{senti\text{-}rvs}^{B} = (M, K)$, where $K = \{k_{mn}^{B} \mid m \in M, n \in M\}$ represents whether sentiment reversal occurs between text $m$ and text $n$ under behavior $B$:

$$k_{mn}^{B} = \begin{cases} 1, & \text{sentiment reversal between text } m \text{ and text } n \text{, i.e., } s_{m} \neq s_{n}, \\ -1, & \text{no sentiment reversal between text } m \text{ and text } n \text{, i.e., } s_{m} = s_{n}, \\ 0, & \text{no interaction relationship } B \text{ between text } m \text{ and text } n. \end{cases}$$

According to the above definition, the number of sentiment reversals between users under a single behavior $B$ is counted. The specific formula is as follows:

$$c_{u_{i}, u_{j}}^{B} = \sum_{m \in M_{u_{i}},\, n \in M_{u_{j}},\, k_{mn}^{B} = 1} k_{mn}^{B}, \tag{3}$$

where $m$ represents a text in the text set $M_{u_{i}}$ of user $u_{i}$, and $n$ represents a text in the text set $M_{u_{j}}$ of user $u_{j}$.

It is also necessary to count the number of times the two users interact under behavior $B$, in order to quantify the proportion of reversals within this behavior:

$$n_{u_{i}, u_{j}}^{B} = \sum_{m \in M_{u_{i}},\, n \in M_{u_{j}}} |k_{mn}^{B}|. \tag{4}$$

Based on the aforementioned statistics of texts between two users, the reversal probability between user $u_{i}$ and user $u_{j}$ under behavior $B$ is obtained:

$$p_{B}(u_{i}, u_{j}) = \frac{c_{u_{i}, u_{j}}^{B}}{n_{u_{i}, u_{j}}^{B}}. \tag{5}$$
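The counting in Equations (3)–(5) can be sketched in a few lines of Python. The function names, the dictionary-based inputs, and the choice to return `None` when two users never interact are assumptions made for illustration; following Algorithm 1, interactions are treated as symmetric ($k_{mn} = k_{nm}$).

```python
def reversal_indicator(s_m, s_n):
    """k_mn^B for two texts that DO interact under behavior B: 1 = reversal, -1 = none."""
    return 1 if s_m != s_n else -1


def reversal_probability(interactions, polarity, author, u_i, u_j):
    """Sketch of p_B(u_i, u_j) for a single behavior B.

    interactions: iterable of (m, n) text-id pairs linked under behavior B
    polarity:     dict text id -> sentiment polarity in {-1, 1}
    author:       dict text id -> user id
    Text pairs that are not linked have k = 0 and contribute to neither sum,
    so iterating over the linked pairs only is sufficient.
    """
    c = 0  # reversal count, Eq. (3)
    n = 0  # interaction count, Eq. (4)
    pair = {u_i, u_j}
    for m, t in interactions:
        if {author[m], author[t]} != pair:
            continue  # this interaction involves other users
        k = reversal_indicator(polarity[m], polarity[t])
        if k == 1:
            c += k
        n += abs(k)
    # Eq. (5); undefined when the users never interact (assumption: return None).
    return c / n if n else None


polarity = {"m1": 1, "m2": -1, "m3": 1, "m4": 1}
author = {"m1": "u1", "m2": "u2", "m3": "u1", "m4": "u2"}
retweets = [("m1", "m2"), ("m3", "m4")]  # one reversal, one agreement

p_retweet = reversal_probability(retweets, polarity, author, "u1", "u2")
```

In this toy input one of the two retweet interactions between `u1` and `u2` flips polarity, so half of their interactions are reversals.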

For each type of behavior, we calculate the reversal probability between each pair of interacting users and then compute the reversal probability for users $u_{i}$ and $u_{j}$ across all interaction behaviors. The specific implementation process for calculating the reversal probability between users for different behaviors is presented in Algorithm 1.

The time complexity of this algorithm is $O(k|E|)$, where $|E|$ is the number of edges in the interaction network $G_{B}$, and $k$ is the time taken to determine whether each text results in a reversal. Before computing the reversal probability between two users, sentiment polarity determination is required for each text, leading to a time complexity of $O(cn)$. Therefore, the overall time complexity of the algorithm is $O(k|E| + cn)$, which is polynomial.

Considering that the degree of sentiment reversal may vary under different behaviors, we incorporate different weights during the fusion of reversal probabilities among users engaged in different behaviors. The specific formula for calculating the sentiment reversal feature $S$ is as follows:

$$S(u_{i}, u_{j}) = \sum_{b \in B} \alpha_{b}\, p_{b}(u_{i}, u_{j}). \tag{6}$$

In the equation, $\alpha_{b}$ represents the weight assigned to behavior $b$, and $p_{b}(u_{i}, u_{j})$ denotes the sentiment reversal probability under behavior $b$.
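As a small worked example of the weighted fusion in Equation (6), the sketch below combines per-behavior reversal probabilities into a single feature. The weight values are hypothetical placeholders; the text does not specify how the weights are chosen.

```python
def reversal_feature(probabilities, weights):
    """S(u_i, u_j) as the weighted sum of per-behavior reversal probabilities.

    probabilities: dict behavior -> p_b(u_i, u_j)
    weights:       dict behavior -> alpha_b (hypothetical values in the example below)
    """
    return sum(weights[b] * p for b, p in probabilities.items())


# Hypothetical weights and probabilities for one user pair.
alpha = {"retweet": 0.6, "review": 0.4}
p_pair = {"retweet": 0.5, "review": 0.25}

s_pair = reversal_feature(p_pair, alpha)  # 0.6 * 0.5 + 0.4 * 0.25
```

A larger weight on one behavior expresses the assumption that reversals under that behavior say more about the relationship between the two users.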

3.5. Sentiment Reversal Feature and User Relationships

Wang et al. concluded through experiments that an emotional reversal is more likely to occur between users without a friend relationship [31]. Based on this, the hypothesis is made: users with a strong emotional reversal have hostile relationships, while users with emotional consistency have friendly relationships. Additionally, the degree to which an emotional reversal reflects friend or foe relationships varies across different behaviors.

Algorithm 1 Calculate the probability of reversal between users

Require: Graph $G_{text}(M, R)$ of the text behavior network; texts $m, n$ ($m \in M_{u_{i}}, n \in M_{u_{j}}$); each edge $r$ has the mapping function $\phi_{B} : B \to \Gamma_{B} = \{\mathrm{retweet}, \mathrm{review}\}$; collection of texts for each user $M_{u_{i}} = \{m_{1}^{u_{i}}, m_{2}^{u_{i}}, \ldots, m_{L}^{u_{i}}\}$.
Ensure: Probability of reversal between $u_{i}$ and $u_{j}$, used to build the graph $G_{senti\text{-}rvs}^{B} = (M, K)$ for each behavior $B$, and the total probability of reversal.
for $m = 1$ to $|M|$ do
    for $n = 1$ to $|M| - m$ do
        if $\phi_{B}(r_{mn}) = \mathrm{retweet}$ then
            Calculate $k_{mn}^{retweet}$ ($= k_{nm}^{retweet}$).
        else
            Calculate $k_{mn}^{review}$ ($= k_{nm}^{review}$).
        end if
        if $k_{mn}^{retweet} = 1$ then
            $c_{u_{i}, u_{j}}^{retweet}$++.   (reversal count, Equation (3))
        end if
        if $k_{mn}^{review} = 1$ then
            $c_{u_{i}, u_{j}}^{review}$++.
        end if
        Calculate $n_{u_{i}, u_{j}}^{retweet} = n_{u_{i}, u_{j}}^{retweet} + |k_{mn}^{retweet}|$.   (interaction count, Equation (4))
        Calculate $n_{u_{i}, u_{j}}^{review} = n_{u_{i}, u_{j}}^{review} + |k_{mn}^{review}|$.
    end for
end for
Calculate $c_{u_{i}, u_{j}} = c_{u_{i}, u_{j}}^{retweet} + c_{u_{i}, u_{j}}^{review}$, $n_{u_{i}, u_{j}} = n_{u_{i}, u_{j}}^{retweet} + n_{u_{i}, u_{j}}^{review}$.
Calculate $p_{retweet}(u_{i}, u_{j}) = \frac{c_{u_{i}, u_{j}}^{retweet}}{n_{u_{i}, u_{j}}^{retweet}}$.
Calculate $p_{review}(u_{i}, u_{j}) = \frac{c_{u_{i}, u_{j}}^{review}}{n_{u_{i}, u_{j}}^{review}}$.
Calculate $p(u_{i}, u_{j}) = \frac{c_{u_{i}, u_{j}}}{n_{u_{i}, u_{j}}}$.

3.6. Unsigned Social Relationship Feature

Because our main focus is on using contextual factors to explore users' underlying emotions towards topics, it is necessary to address the social environment and the reachability of the target user. Methods based on attribute similarity tend to form edges between users who have similar attributes but no direct connection, which makes them unsuitable for this problem. Therefore, this paper emphasizes structural similarity.

Methods based on structural similarity indicate that two users within similar network structures are likely to be similar. For instance, users within the same community are more likely to form positive edges. In the approach based on structural similarity, the paper focuses on local information factors, i.e., the influence of surrounding neighbors on the target node.

From the original heterogeneous information network $G_{old} = (V, E)$, we extract nodes of the user type and edges of the unsigned social relationship type, denoted as $E_{p} : \phi_{E}(\varepsilon) = p$. Here, unsigned social relationships include non-textual social interactions such as following relationships. Considering the impact of the power-law distribution discussed in the previous section, user relationships with fewer than 3 recorded behaviors are also categorized as unsigned social relationships, thereby mitigating the sparsity of user relationships to some extent.

We borrow the Jaccard similarity definition and propose a method for calculating similarity based on overlapping neighbors. This method is used to calculate the similarity between connected user pairs. The specific formula is as follows:

$$J(u_{i}, u_{j}) = \frac{|N_{u_{i}} \cap N_{u_{j}}|}{|N_{u_{i}} \cup N_{u_{j}}| + 1}, \tag{7}$$

where $N_{u_{i}}$ represents the neighbor set of user $u_{i}$. To handle the case where the neighbor set of an isolated node is empty and to prevent division by zero, we add 1 to the denominator.
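Equation (7) translates directly into set operations. The following sketch assumes the unsigned social network is stored as a dictionary mapping each user to the set of its neighbors; the function name and the toy graph are illustrative only.

```python
def neighbor_similarity(neighbors, u_i, u_j):
    """Smoothed Jaccard similarity of Eq. (7): |N_i & N_j| / (|N_i | N_j| + 1)."""
    n_i = neighbors.get(u_i, set())
    n_j = neighbors.get(u_j, set())
    # The +1 keeps the denominator non-zero when both neighbor sets are empty.
    return len(n_i & n_j) / (len(n_i | n_j) + 1)


# Toy undirected network: a-b, a-c, a-d, b-c.
neighbors = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

# a and b share one neighbor (c); their union {a, b, c, d} has four members.
j_ab = neighbor_similarity(neighbors, "a", "b")  # 1 / (4 + 1)
```

Note that the added 1 means the score never reaches 1, even for users with identical neighbor sets.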

Similarly, considering the issue of reachability, we use this method to calculate the similarity only between connected user pairs. Therefore, the time complexity is $O(k|E|)$, where $k$ represents the time to calculate user similarity, and $|E|$ represents the number of existing unsigned connections. It is assumed that dissimilar users do not necessarily have a competitive relationship; they may simply be indifferent to each other. On the other hand, similar users are likely to be friends. Therefore, the method based on unsigned social relationships is used specifically for exploring positive relationships.

3.7. Fusion of Multi-Feature Relationship Prediction

In addition to thoroughly exploring interactions among users, we also consider inherent individual characteristics, such as the personal activity level ($A_{u}$) and personal traits ($O_{u}$).

The personal activity level ($A_{u}$) refers to the degree of a user’s involvement in topic discussions. The higher the user’s participation in topic discussions, the more accurately their behavior reflects emotional sentiments. The user’s activity level is specifically manifested as the frequency of their engagement in activities related to topics. It is calculated as follows:

$$A_{u} = \sum_{u' \in N_{u}} n_{u, u'}, \tag{8}$$

where $N_{u}$ is the set of neighbors of user $u$, and $n_{u, u'}$ represents the total number of activities between user $u$ and their neighbor $u'$.

Personal characteristics, denoted as $O_{u}$, refer to the user’s inherent tendencies expressed through their actions. For example, there exists a group of users who consistently exhibit sentiment reversals in most of their activities, refraining from expressing sentiment for texts they agree with or support and vice versa. This characteristic is represented by the average reversal occurrence across all user activities. The calculation is given by the formula:

$$O_{u} = \frac{1}{|N_{u}|} \sum_{u' \in N_{u}} p(u, u'), \tag{9}$$

where $p(u, u')$ represents the overall sentiment reversal probability between user $u$ and neighboring user $u'$ across all behaviors, as computed in Algorithm 1.
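The two individual features in Equations (8) and (9) can be sketched as follows. The data layout (neighbor sets plus dictionaries keyed by unordered user pairs) and the value returned for a user without neighbors are assumptions made for this illustration.

```python
def activity_level(u, neighbors, n_total):
    """A_u, Eq. (8): total number of activities between u and all of its neighbors."""
    return sum(n_total[frozenset((u, v))] for v in neighbors[u])


def personal_characteristic(u, neighbors, p_total):
    """O_u, Eq. (9): reversal score p(u, u') averaged over the neighbors of u."""
    nbrs = neighbors[u]
    if not nbrs:
        return 0.0  # assumption: Eq. (9) is undefined for a user without neighbors
    return sum(p_total[frozenset((u, v))] for v in nbrs) / len(nbrs)


neighbors = {"u1": {"u2", "u3"}}
# Keys are unordered user pairs; all values are toy numbers.
n_total = {frozenset(("u1", "u2")): 4, frozenset(("u1", "u3")): 2}
p_total = {frozenset(("u1", "u2")): 0.5, frozenset(("u1", "u3")): 0.0}

a_u1 = activity_level("u1", neighbors, n_total)           # 4 + 2
o_u1 = personal_characteristic("u1", neighbors, p_total)  # (0.5 + 0.0) / 2
```

Using `frozenset` keys reflects the symmetric treatment of user pairs in Algorithm 1.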

For user relationship prediction, a classification is performed considering sentiment reversal $S$, social relationship $J$, personal characteristics $O_{u}$, and personal activity level $A_{u}$. The goal is to classify the user interaction as either a friendly relationship $p^{+}$ or an adversarial relationship $p^{-}$.

The overall implementation of the algorithm follows the outlined approach. Finally, logistic regression is employed to fuse all extracted relevant features, predicting potential signed relationships between users: positive relationships $p^{+}$ and negative relationships $p^{-}$.
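The fusion step can be illustrated with a minimal logistic regression written in plain Python. This is only a sketch under stated assumptions: the feature order $[S, J, O_{u}, A_{u}]$, the rescaling of $A_{u}$ to $[0, 1]$, the toy training pairs, and the hyperparameters are all hypothetical, and the text does not describe the training setup actually used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Plain stochastic gradient descent on the logistic loss (illustrative only)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            g = p - target  # gradient of the loss with respect to the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def predict_relation(w, b, x):
    """Return 'p+' (friendly) or 'p-' (adversarial) for one feature vector."""
    score = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
    return "p+" if score >= 0.5 else "p-"


# Toy feature vectors [S, J, O_u, A_u]; A_u is assumed to be rescaled to [0, 1].
X = [
    [0.1, 0.6, 0.2, 0.5],  # little reversal, many shared neighbors
    [0.0, 0.5, 0.1, 0.8],
    [0.9, 0.1, 0.3, 0.4],  # frequent reversal, few shared neighbors
    [0.8, 0.0, 0.2, 0.9],
]
y = [1, 1, 0, 0]  # 1 = friendly (p+), 0 = adversarial (p-); hypothetical labels

w, b = fit_logistic(X, y)
predictions = [predict_relation(w, b, x) for x in X]
```

On this toy data the learned weight on $S$ is negative and the weight on $J$ is positive, which matches the hypothesis that strong reversal indicates hostility while structural similarity indicates friendship.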

Topic Relationship Prediction Based on Meta-Path Similarity

This subsection focuses on topic signed relationship prediction. From the original heterogeneous information network $G_{old} = (V, E)$, we extract edges of the types representing historical sentiment relationships between users and topics, i.e., $E_{s^{+}} : \phi_{E}(\varepsilon) = s^{+}$ and $E_{s^{-}} : \phi_{E}(\varepsilon) = s^{-}$. This forms the original input network. By constructing meta-paths, we mine rich semantic information in the user sentiment network, obtaining the coupling and competition between topics. This leads to the construction of a signed network representing relationships between topics.

Non-controversial Topic Mining: This section analyzes the possible correlations between topics based on the main objectives of the paper, namely, coupling and competition arising from user behavior. Topics with coupling tend to receive similar sentiment attitudes from the same group of users, while topics with competition usually encounter opposing sentiment attitudes. Exploring such relationships serves as contextual factors aiding in the discovery of users’ unknown attitudes towards topics in subsequent tasks. Many methods assume that tasks involving mining node relationships require simultaneous consideration of two nodes, neglecting the inherent properties of the nodes. In this context, user emotional states may depend on the nature of the considered topics. Naskar et al. [32], experimenting with topics related to various terrorist attacks, such as the Syrian terrorist attacks, indicated that users maintain highly negative sentiment attitudes towards topics related to terrorist attacks. The sentiment evolution of such topics deviates from the average level of general topics. Therefore, it is crucial to consider such topics, which have special properties, separately.

Due to the inherent nature of topics leading to user tendencies in sentiment, topics for which users tend to exhibit consistent attitudes are considered non-controversial topics. Non-controversial topics include strongly positive and strongly negative topics. Strongly positive topics refer to topics for which users participating in discussions generally maintain a positive attitude, such as HappyNationalDay and WinterOlympicsSmoothOpening. On the other hand, strongly negative topics refer to topics for which users participating in discussions generally maintain a negative attitude, such as TerroristAttack and EasternAirlines MU5735 Crash.

To avoid the law-of-small-numbers problem caused by sample sparsity, we define topics with more than 10 users participating in discussions in the historical sentiment data as candidate topics. The set of non-controversial topics is mined by calculating the information entropy of candidate topics. From the original heterogeneous information network $G_{old} = (V, E)$, edges of the types representing positive and negative sentiment relationships, i.e., $E_{s^{+}} : \phi_{E}(\varepsilon) = s^{+}$ and $E_{s^{-}} : \phi_{E}(\varepsilon) = s^{-}$, can be extracted. Information entropy is utilized to measure the diversity of user attitudes towards each topic. Information entropy is a method used to measure the degree to which the categories in a dataset tend to be consistent. Larger information entropy indicates more balanced user attitudes, while smaller information entropy indicates that the topic has strong special properties leading to user tendencies in sentiment. The formula for calculating the information entropy $H_{t_{i}}$ for topic $t_{i}$ is as follows:

$$H_{t_{i}} = - \sum_{x \in \{s^{+}, s^{-}\}} p_{t_{i}}(x) \log p_{t_{i}}(x), \tag{10}$$

where

p_{t_{i}} (x)

represents the proportion of users with positive or negative sentiment in all users participating in topic

t_{i}

, and

H_{t_{i}}

is the information entropy of topic

t_{i}

ranging from 0 to 1. Topics with information entropy less than

0.4

are considered non-controversial, and the sentiment polarity of these topics is determined. Topics with different sentiment polarities are competitive, while topics with the same sentiment polarity are coupled, determining the correlation between non-controversial topics.
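The mining step above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `(user, topic, sign)` input format, and the choice of a base-2 logarithm are all assumptions, and each user is assumed to hold at most one sentiment link per topic. The thresholds (more than 10 users, entropy below 0.4) are the ones given in the text.

```python
import math
from collections import defaultdict


def topic_entropy(pos: int, neg: int) -> float:
    """Base-2 entropy of the positive/negative user split of one topic."""
    total = pos + neg
    if total == 0:
        raise ValueError("topic has no sentiment links")
    h = 0.0
    for count in (pos, neg):
        if count:  # the term 0 * log(0) is taken as 0
            p = count / total
            h -= p * math.log2(p)
    return h


def mine_non_controversial(links, min_users=10, max_entropy=0.4):
    """Return {topic: polarity} for non-controversial topics.

    links: iterable of (user, topic, sign) with sign +1 or -1 (hypothetical format).
    """
    counts = defaultdict(lambda: [0, 0])  # topic -> [positive, negative]
    for _user, topic, sign in links:
        counts[topic][0 if sign > 0 else 1] += 1

    result = {}
    for topic, (pos, neg) in counts.items():
        if pos + neg <= min_users:  # candidate topics need more than min_users users
            continue
        if topic_entropy(pos, neg) < max_entropy:
            result[topic] = 1 if pos > neg else -1
    return result


def topic_relation(polarity_a: int, polarity_b: int) -> str:
    """Same polarity means coupled, different polarity means competitive."""
    return "coupled" if polarity_a == polarity_b else "competitive"
```

For example, a topic with 19 positive and 1 negative user has an entropy of roughly 0.29 and is kept as strongly positive, whereas an evenly split topic has entropy 1 and is discarded.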

The Heterogeneous Signed Information Network is represented as $G_{new} = (V, E')$.

Node types are defined as $\Gamma'_{V} = \{u, v\}$, where $u$ represents nodes of user type and $v$ represents nodes of topic type. Their initial node embeddings are represented in one-hot encoding based on their respective attribute features. We use Figure 3 to illustrate the various relationships and their corresponding symbolic representations.

Edge types consist of six relationships in three semantic spaces, $\Gamma'_{E} = \{P, S, Q\}$. In the user–user relationship space $P = \{p^{+}, p^{-}\}$, $p^{+}$ represents friendship relationships and $p^{-}$ represents antagonistic relationships between user-type nodes $u$. In the user–topic relationship space $S = \{s^{+}, s^{-}\}$, $s^{+}$ represents positive sentiment links (indicating user $u$ supports topic $v$) and $s^{-}$ represents negative sentiment links (indicating user $u$ opposes topic $v$). In the topic–topic relationship space $Q = \{q^{+}, q^{-}\}$, $q^{+}$ represents coupling relationships and $q^{-}$ represents competitive relationships between topic-type nodes $v$.

Since each relationship corresponds to fixed types of nodes, the adjacency matrix of this heterogeneous signed network can be represented as:

A = a_{ij} \in \begin{cases} \{P, 0\} & \text{if } i \in u,\ j \in u \\ \{S, 0\} & \text{if } i \in u,\ j \in v \\ \{Q, 0\} & \text{if } i \in v,\ j \in v \\ 0 & \text{otherwise} \end{cases}

(11)

where $a_{ij} = 0$ indicates that the relationship between those nodes is unknown.
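As an illustration of the type constraint in Equation (11), the following sketch stores the signed network sparsely and rejects relations that do not match the node types of their endpoints. The relation labels (`"p+"`, `"s-"`, and so on), the dictionary-based storage, and the function names are assumptions made for this example only.

```python
# Relation labels allowed for each (source type, target type) pair, following Eq. (11).
ALLOWED = {
    ("u", "u"): {"p+", "p-"},  # user-user space P
    ("u", "v"): {"s+", "s-"},  # user-topic space S
    ("v", "v"): {"q+", "q-"},  # topic-topic space Q
}


def build_adjacency(node_types, edges):
    """Build a sparse adjacency map {(i, j): relation}.

    node_types: dict mapping node id to "u" (user) or "v" (topic).
    edges: iterable of (i, j, relation) triples.
    """
    adjacency = {}
    for i, j, relation in edges:
        allowed = ALLOWED.get((node_types[i], node_types[j]), set())
        if relation not in allowed:
            raise ValueError(f"relation {relation!r} is not valid between {i!r} and {j!r}")
        adjacency[(i, j)] = relation
    return adjacency


def lookup(adjacency, i, j):
    """Return the relation a_ij, or 0 when the relationship is unknown."""
    return adjacency.get((i, j), 0)
```

Keeping only known entries mirrors the convention that $a_{ij} = 0$ means "unknown" rather than "no relationship".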

Mining Relationships Between Controversial Topics. Most existing methods start from the attributes of topics: they cluster topics by feature similarity to find similar ones and thereby obtain potential relationships between topics. This approach has two drawbacks: first, it focuses only on positive relationships between topics and overlooks negative ones; second, topics may be similar in their attributes yet stand in a negative relationship, which cannot be determined by attribute similarity alone.

The coupling and competition of topics are determined by the user’s attitude, making the analysis based on the user’s historical sentiment data reasonable. It is important to note the corresponding user and the signed polarity of their sentiment links. Path-based methods preserve node feature information along the paths and retain different semantic relationships based on different path patterns. Path-based methods are often used for semantic extraction between nodes in heterogeneous networks. Additionally, considering all topics requires calculating the similarity between each pair of topics, involving expensive matrix multiplication operations leading to increased time complexity. Therefore, we set requirements for the candidate topic neighbors to prune the matrix multiplication. In the process of extracting path instances, the second-order reachable neighbors for each topic i with a reachable path count greater than 5 are selected as candidate neighbors.

In the defined heterogeneous information network $G_{old} = (V, E)$, nodes are of user type $u$ and topic type $v$, and there are three edge types, $\Gamma_{E} = \{p, s^{+}, s^{-}\}$, where $p$ is the unsigned social relationship between user-type nodes. Since links have different signed semantics, four types of meta-path patterns are defined:

Meta-path Pattern One: The same user expresses a positive attitude towards two topics $T_{i}$ and $T_{j}$.

T_{i} \overset{(s^{+})^{-1}}{\longrightarrow} U \overset{s^{+}}{\longrightarrow} T_{j}

(12)

Meta-path Pattern Two: The same user expresses a negative attitude towards two topics $T_{i}$ and $T_{j}$.

T_{i} \overset{(s^{-})^{-1}}{\longrightarrow} U \overset{s^{-}}{\longrightarrow} T_{j}

(13)

Meta-path Pattern Three: The same user expresses a positive attitude towards the starting topic $T_{i}$ and a negative attitude towards the ending topic $T_{j}$.

T_{i} \overset{(s^{+})^{-1}}{\longrightarrow} U \overset{s^{-}}{\longrightarrow} T_{j}

(14)

Meta-path Pattern Four: The same user expresses a negative attitude towards the starting topic $T_{i}$ and a positive attitude towards the ending topic $T_{j}$.

T_{i} \overset{(s^{-})^{-1}}{\longrightarrow} U \overset{s^{+}}{\longrightarrow} T_{j}

(15)
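To make the four patterns concrete, the sketch below counts topic-user-topic path instances grouped by the signs of their two links, and then applies the pruning rule described earlier (keep second-order neighbors with more than 5 reachable paths). The input format and function names are hypothetical, and summing over all four patterns when counting reachable paths is an assumption, since the text does not say which patterns the threshold applies to.

```python
from collections import defaultdict
from itertools import product


def count_meta_paths(links):
    """Count topic-user-topic path instances, grouped by sign pattern.

    links: iterable of (user, topic, sign) with sign +1 (s+) or -1 (s-).
    Returns counts[(sign_i, sign_j)][(topic_i, topic_j)], so the key (1, 1)
    is Pattern One, (-1, -1) Pattern Two, (1, -1) Pattern Three and
    (-1, 1) Pattern Four.
    """
    by_user = defaultdict(list)
    for user, topic, sign in links:
        by_user[user].append((topic, sign))

    counts = defaultdict(lambda: defaultdict(int))
    for attitudes in by_user.values():
        # Every ordered pair of one user's attitudes is one path instance;
        # pairs with topic_i == topic_j are self-paths.
        for (t_i, s_i), (t_j, s_j) in product(attitudes, repeat=2):
            counts[(s_i, s_j)][(t_i, t_j)] += 1
    return counts


def candidate_neighbors(counts, topic, min_paths=5):
    """Topics reachable from `topic` by more than `min_paths` path instances."""
    reachable = defaultdict(int)
    for pattern_counts in counts.values():
        for (t_i, t_j), n in pattern_counts.items():
            if t_i == topic and t_j != topic:
                reachable[t_j] += n
    return {t for t, n in reachable.items() if n > min_paths}
```

Grouping by user first avoids the full topic-by-topic matrix product: only topic pairs that share at least one user are ever touched.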

The four meta-path patterns can be summarized into two types of paths: symmetric meta-paths and asymmetric meta-paths. Symmetric meta-paths refer to instances where the same user expresses similar sentiments towards two different topics, indicating a coupling relationship between topics. Asymmetric meta-paths, on the other hand, refer to instances where the same user expresses opposite sentiments towards two different topics, indicating a competitive relationship between topics. The similarity between meta-paths is calculated based on these two path types:

Calculation of Topic Coupling Based on Symmetric Meta-paths: For meta-path type $P_{1}$, we count the number of path instances from topic $i$ to topic $j$ that conform to $P_{1}$, denoted as $p_{i \to j}$, and the number of such path instances from topic $i$ and from topic $j$ back to themselves, denoted as $p_{i \to i}$ and $p_{j \to j}$, respectively. The similarity is then:

\mathrm{PathSim}(i, j) = \frac{2 \cdot |\{p_{i \to j} : p_{i \to j} \in P_{1}\}|}{|\{p_{i \to i} : p_{i \to i} \in P_{1}\}| + |\{p_{j \to j} : p_{j \to j} \in P_{1}\}|}

(16)
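A small worked sketch of Equation (16) follows. It assumes that the symmetric type covers both same-sign patterns (Pattern One and Pattern Two), so a path instance between two topics is a user who is positive on both or negative on both. The dictionaries `pos` and `neg`, which map each topic to the set of users with positive or negative links, are a hypothetical input format.

```python
def symmetric_paths(pos, neg, a, b):
    """Number of same-sign topic-user-topic path instances between a and b."""
    empty = set()
    shared_pos = pos.get(a, empty) & pos.get(b, empty)
    shared_neg = neg.get(a, empty) & neg.get(b, empty)
    return len(shared_pos) + len(shared_neg)


def path_sim(pos, neg, i, j):
    """PathSim of Eq. (16): 2 * p_ij / (p_ii + p_jj), or 0.0 if no self-paths."""
    denominator = symmetric_paths(pos, neg, i, i) + symmetric_paths(pos, neg, j, j)
    if denominator == 0:
        return 0.0
    return 2 * symmetric_paths(pos, neg, i, j) / denominator
```

With `pos = {"A": {1, 2, 3}, "B": {2, 3, 4}}` and `neg = {"A": {5}, "B": {5, 6}}`, topics A and B share three same-sign users, have four and five self-paths respectively, and so obtain a coupling score of 2 · 3 / (4 + 5) = 2/3.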

Calculation of Topic Competition Based on Asymmetric Meta-paths: Because the links in meta-path type $P_{2}$ have different semantics, the PathSim-based method is not applicable, and the HeteSim method is used instead. The encounter probability of the two end nodes under a path $P$ is defined as:

PM_{P} = U_{A_{1} A_{2}} U_{A_{2} A_{3}} \cdots U_{A_{m-1} A_{m}} U_{A_{m} A_{m+1}} \cdots U_{A_{l-1} A_{l}} \quad (P = A_{1} A_{2} A_{3} \cdots A_{m-1} A_{m} A_{m+1} \cdots A_{l}),

(17)

PM_{P} = PM_{P_{L}}(A_{1}) \, PM_{P_{R}}(A_{l}),

(18)

where $PM_{P}$ is the product of the reachable probability matrices of the left and right halves of path pattern $P_{2}$, split at the midpoint type $M$, and $U_{ij}$ denotes the adjacency matrix of topics $i$ and $j$, normalized along the row direction.

After calculating $PM_{P}$, normalization is performed:

\mathrm{HeteSim}(i, j) = \frac{PM_{P_{L}}(i) \cdot PM^{'}_{P_{R}^{-1}}(j)}{\sqrt{\| PM_{P_{L}}(i) \| \cdot \| PM^{'}_{P_{R}^{-1}}(j) \|}}

(19)
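For the length-two paths used here, the midpoint type is the user, so each half of the path is a single row-normalized topic-to-user step. The sketch below illustrates this for Pattern Three: the left vector spreads probability evenly over users who are positive on topic $i$, and the right vector over users who are negative on topic $j$. It reads the normalization as the cosine of the two reachable-probability vectors with Euclidean norms, which is an assumption about how the denominator of Equation (19) is meant; the function names and set-based inputs are likewise hypothetical.

```python
import math


def reach_vector(user_set, users):
    """Row-normalized reachability from one topic to each user in `users`."""
    if not user_set:
        return [0.0] * len(users)
    weight = 1.0 / len(user_set)
    return [weight if u in user_set else 0.0 for u in users]


def hete_sim(left_users, right_users):
    """Cosine of the two half-path reachability vectors.

    left_users: users linked to topic i by the first sign of the pattern.
    right_users: users linked to topic j by the second sign of the pattern.
    """
    users = sorted(left_users | right_users)
    left = reach_vector(left_users, users)
    right = reach_vector(right_users, users)
    dot = sum(a * b for a, b in zip(left, right))
    norm_left = math.sqrt(sum(a * a for a in left))
    norm_right = math.sqrt(sum(b * b for b in right))
    if norm_left == 0.0 or norm_right == 0.0:
        return 0.0
    return dot / (norm_left * norm_right)
```

For Pattern Three one would call `hete_sim(pos[i], neg[j])`, and for Pattern Four `hete_sim(neg[i], pos[j])`. Under this reading the score reduces to the number of shared users divided by the geometric mean of the two set sizes.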

Topic $i$ first checks whether its second-order neighbor topic $j$ satisfies the criteria for candidate topic neighbors. If topic $j$ qualifies as a candidate topic neighbor, the coupling degree and the competitive degree of topics $i$ and $j$ are calculated using the two methods mentioned above and combined to obtain the meta-path similarity $H$:

H(T_{i}, T_{j}) = F(\mathrm{PathSim}(T_{i}, T_{j}), \mathrm{HeteSim}(T_{i}, T_{j}))

(20)

H(T_{i}, T_{j}) = \beta_{H} \, \mathrm{HeteSim}(T_{i}, T_{j}) - (1 - \beta_{H}) \, \mathrm{PathSim}(T_{i}, T_{j})

(21)
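Equation (21) is a signed weighted difference, which a short sketch makes easy to check by hand. The default `beta_h = 0.5` is a placeholder chosen for illustration; the text does not state the value used in the experiments.

```python
def meta_path_similarity(hete_sim_score, path_sim_score, beta_h=0.5):
    """Eq. (21): competition (HeteSim) minus coupling (PathSim), weighted by beta_h."""
    if not 0.0 <= beta_h <= 1.0:
        raise ValueError("beta_h must lie in [0, 1]")
    return beta_h * hete_sim_score - (1.0 - beta_h) * path_sim_score
```

Because the competition term enters with a positive sign and the coupling term with a negative sign, a positive $H$ suggests that competition dominates and a negative $H$ that coupling dominates. For example, with a HeteSim of 0.5, a PathSim of 2/3 and equal weights, $H = 0.25 - 0.333 \approx -0.083$.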

Feature fusion: For topic relationship prediction, the relationship $Q$ between topics is classified based on the fusion of the topic characteristics $O_{v}$ and the meta-path similarity $H$.

Q(v_{i}, v_{j}) = \beta_{1} O_{i} \odot O_{j} + \beta_{2} H(T_{i}, T_{j})

(22)

where $\odot$ is the element-wise exclusive NOR (XNOR) operator, yielding one for identical elements and zero for different ones. Finally, logistic regression is employed to fuse and classify all relevant features, completing the prediction of potential signed relationships between topics: the coupling relation $q^{+}$ and the competitive relation $q^{-}$.
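The fusion step can be illustrated as follows. The sketch compares two one-hot topic feature vectors element by element (one where they agree, zero where they differ) and appends the weighted meta-path similarity, producing a feature vector that a logistic regression classifier could consume. Treating the result as a concatenated feature vector, and the weights `beta_1` and `beta_2` shown, are assumptions; Equation (22) does not spell out how the vector and scalar terms are combined.

```python
def elementwise_match(o_i, o_j):
    """Return 1 where the two feature vectors agree and 0 where they differ."""
    if len(o_i) != len(o_j):
        raise ValueError("feature vectors must have the same length")
    return [1 if a == b else 0 for a, b in zip(o_i, o_j)]


def fused_features(o_i, o_j, h, beta_1=0.5, beta_2=0.5):
    """Weighted topic-characteristic agreement followed by the weighted similarity H."""
    return [beta_1 * m for m in elementwise_match(o_i, o_j)] + [beta_2 * h]
```

The resulting vectors, one per candidate topic pair, would then be passed with their known $q^{+}$ or $q^{-}$ labels to any standard logistic regression implementation.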

4. Experiments

4.1. Baselines

We compared our method with network structure-based methods:

Node2vec [33]: Utilizes the principle of graph random walks, employing second-order random walks and skip-gram to learn node embeddings;
SDNE [31]: Employs an autoencoder to capture both local and global structures of the target network. Local and global structures consider first-order and second-order similarities of nodes, respectively;
MF [34]: Based on the essence of matrix factorization. It decomposes the adjacency matrix of a network into two low-rank matrices to learn node features within the given network. The low-rank matrices are then corrected by reconstructing the adjacency matrix;
LP [32]: Applies the Local Path index to a signed network, considering third-order path counts between nodes on top of the second-order paths. It serves as a similarity metric based on global information.

4.2. Dataset

To validate the effectiveness of our proposed model, we conducted experiments on four public datasets:

Customer Support on Twitter (https://www.kaggle.com/datasets/thoughtvector/customer-support-on-twitter, accessed on 2 March 2023): A million-scale dataset consisting of 2,811,774 tweets and replies exchanged between large enterprises and their customers. A total of 702,777 users participated in the comments, with 375,460 being negative and 16,419,719 being positive;
Weibo Dataset [35]: The dataset includes partial user relationships, user text content, and forwarding links between texts, totaling 63,641 user profiles and 84,168 text entries. Among them, there are 1,048,575 user relationships and 27,759 forwarding relationships. It can be observed that a small number of users engage in hundreds of forwarding and text publishing behaviors, while the majority of users exhibit lower levels of activity;
Wiki Dataset [36]: The Wiki dataset consists of data from Wikipedia administrator elections, encompassing two user levels and their voting activities. The voting scenarios involve 3 categories: 1, −1, and 0, representing support, opposition, and neutrality, respectively. The dataset spans nearly 2800 elections, with 104,167 votes cast, involving 7126 users participating in the elections;
Slashdot Dataset [37]: The Slashdot dataset comprises user data from the technology news commenting website Slashdot. It captures a signed network formed by friendships and enmities among users on the website, totaling 82,144 nodes and 549,202 relationships.

Negative Relationships. In a dataset of 2,017,439 Twitter comments, a total of 312,655 tweets experienced sentiment reversal, while 1,704,784 comment texts maintained consistency with the sentiment of the commented texts. According to the principle of sentiment consistency, the tweets were associated with their authors. In cases where users posted multiple tweets, the probability of sentiment reversal was calculated. Pairs of users with a sentiment reversal probability greater than 0.5 were considered to have experienced sentiment reversal, resulting in a total of 43,100 pairs of users engaging in commenting interactions.

In 27,759 cases of retweet relationships in Weibo data, a total of 19,226 retweets did not experience sentiment reversal, while 8533 retweets underwent sentiment reversal. In 25,552 cases of users engaging in retweet interactions, pairs of users with a sentiment reversal probability greater than 0.5 were considered to have experienced sentiment reversal. There were a total of 17,773 cases where sentiment reversal did not occur between users, while 7779 cases involved sentiment reversal between users.

From these two datasets, it can be observed that the distribution of sentiment reversals between texts and their respective authors is relatively consistent. This reflects that in the online world, most interactions exhibit supportive attitudes, but there is still a portion of interactions that result in sentiment reversals. Therefore, it is essential to recognize that interactions between users are not solely positive relationships, and it is necessary to investigate and explore interactions that involve negative relationships.

Analysis of Interaction Volume between Users. Statistics were conducted on the interaction between users, and the data in Table 1 represents the number of pairs of users with the respective interaction counts.

It can be observed that most user pairs interact only a few times, with interaction counts concentrated in the range of 1–4. However, some pairs of users interact many times: the highest interaction count between users is 37 in the Twitter dataset, and it reaches 59 retweet interactions in the Weibo dataset. Intuitively, the more often two users interact, the more reliably their interactions reflect the relationship between them. Conversely, a small number of interactions may be incidental and introduce significant noise and bias: the calculated reversal probability then fluctuates greatly, making it difficult to judge relationships between users from it. Therefore, in the subsequent analysis, pairs of users with fewer than 3 retweet interactions are temporarily not considered.

Distribution of User Reversal Data and User Follow Data. Figure 4 shows the distribution of relationships among users in the Weibo dataset, including whether a social relationship exists and whether an emotional reversal occurs. As shown in the figure, users who interact do not necessarily have a following relationship, and more interactions occur between users who do not follow each other. This suggests that exploring the interaction relationships between users can, to some extent, address the issue of sparsity. Moreover, during interactions, users are highly likely to maintain emotional consistency, meaning that emotional reversal does not occur. However, in nearly a quarter of cases, an emotional reversal does occur, further confirming that treating interaction relationships solely as positive relationships is too simplistic. Neglecting the impact of negative relationships may lead to biases in the final analysis.

Emotional reversal situations under user-following relationships. When there is an interaction relationship between users, the distribution of the number of user pairs with no emotional reversal and the number of user pairs with emotional reversal is shown in Figure 5.

From Figure 5, it can be observed that users with relationship edges are very likely not to experience an emotional reversal, indicating a clear connection between emotional reversal and user relationships. In addition, user pairs with interaction counts of 2 and 3 may have a certain bias due to the small number of interactions, resulting in a relatively lower proportion of no reversal.

Following the approach used to measure correlation between discrete variables, we obtain the correlation coefficient between emotional reversal probability and follower relationships by calculating information gain. To account for the influence of interaction volume, we calculate this information-gain-based correlation coefficient separately for different interaction volumes. The specific results are shown in Figure 6.
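Information gain between two discrete variables is the entropy of one minus its conditional entropy given the other. The sketch below computes it in bits for two equal-length label sequences, for instance a per-pair reversal flag and a per-pair follow flag. The function names and the binary encoding are assumptions; the text does not describe how the gain is normalized into the reported coefficient, so no normalization is applied here.

```python
import math
from collections import Counter


def entropy(labels):
    """Base-2 entropy of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(x, y):
    """H(x) - H(x | y) for two equal-length sequences of discrete labels."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    groups = {}
    for x_value, y_value in zip(x, y):
        groups.setdefault(y_value, []).append(x_value)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(x) - conditional
```

To follow the per-volume analysis, one would group the user pairs by interaction count and call `information_gain` once per group.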

The correlation coefficients indicate that emotional reversal probability and the following relationship are correlated. Moreover, in the Weibo dataset, the correlation coefficient steadily increases with interaction volume, except for user pairs with interaction volumes of 7 and 10. A comparison with Figure 5 suggests a reason: user pairs with interaction volumes of 7 and 10 are few, so the anomalies may be caused by insufficient data. In the Twitter dataset, the correlation is relatively high for user pairs with large interaction volumes. Therefore, an emotional reversal can be used to judge user relationships when there is a certain level of interaction between users.

The Weibo repost dataset consists of Chinese text, while the Twitter customer support dataset contains English text. In this paper, we use the third-party libraries SnowNLP and TextBlob for sentiment analysis of Chinese and English texts, respectively, to categorize the sentiments of all texts. Given the unique linguistic context of Weibo, we pre-train the sentiment classification model in SnowNLP, incorporating the HowNet Weibo sentiment dataset to enhance the text sentiment analysis capability. Considering the potential errors in text sentiment classification, this study identifies a sentiment reversal between texts when the difference in sentiment scores exceeds 0.6 (with the sentiment score ranging from 0 to 1). Based on these criteria, we calculate the probability of sentiment reversal among authors. In the Weibo repost dataset, repost links whose authors cannot be identified in the text repost network are deleted. This process results in a behavioral interaction network between texts.
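The reversal criteria described in this section can be summarized in a short sketch. It assumes the sentiment scores have already been produced by a classifier and lie in [0, 1]; no sentiment library is called here. The tuple format and function names are hypothetical, while the thresholds (score difference above 0.6, at least 3 interactions per pair, reversal probability above 0.5) are the ones stated in the text.

```python
from collections import defaultdict


def is_reversal(score_source, score_reply, threshold=0.6):
    """A reply reverses sentiment when the score gap exceeds the threshold."""
    return abs(score_source - score_reply) > threshold


def reversal_probabilities(interactions, min_interactions=3):
    """Return {(author_a, author_b): reversal probability}.

    interactions: iterable of (author_a, author_b, score_a, score_b) tuples,
    one per reply or repost. Pairs with fewer than `min_interactions`
    interactions are dropped as too noisy.
    """
    stats = defaultdict(lambda: [0, 0])  # pair -> [reversals, total]
    for author_a, author_b, score_a, score_b in interactions:
        pair = (author_a, author_b)
        stats[pair][1] += 1
        if is_reversal(score_a, score_b):
            stats[pair][0] += 1
    return {
        pair: reversals / total
        for pair, (reversals, total) in stats.items()
        if total >= min_interactions
    }


def signed_relation(probability, cutoff=0.5):
    """Map a reversal probability to a negative (-1) or positive (+1) relation."""
    return -1 if probability > cutoff else 1
```

A pair with three interactions, two of which reverse sentiment, has a reversal probability of 2/3 and is therefore treated as a negative relationship.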

4.3. Result

4.3.1. Prediction of User Relationships and Topic Associations

Because interactions between users are taken into account, the comparative experiments of the HMSN (Heterogeneous Multi-relation Signed Network) model were conducted only on the Weibo repost dataset. The experimental results are shown in Table 2. The topic prediction experiments were conducted using the Wiki dataset and the Slashdot dataset, and the experimental results are presented in Table 3. In the result tables, bold indicates the best performance, and underlined indicates the second-best performance.

The experimental results indicate that the proposed user relationship prediction model, HMSN, based on sentiment reversal, achieves relatively good experimental performance. Analysis of its principles reveals several points:

Node2Vec, SDNE, and LP simultaneously consider higher-order relationships between nodes. Building on this, SDNE integrates both relationships using an autoencoder, resulting in better experimental performance than Node2Vec. LP, considering third-order neighbor relationships, outperforms the former two. Thus, it is necessary to consider the mutual influence between features. Node2Vec and SDNE, in their exploration of neighbors, do not take into account the relationships along the paths. Consequently, they overlook the impact of sign factors. Therefore, in homogeneous networks, both may yield better experimental results. However, in signed networks, they fail to achieve satisfactory experimental outcomes;
Matrix factorization achieves the best experimental performance among the baseline models. This might be because the model simultaneously considers node features and structural relationships, indicating that node features play a role in improving experimental performance. The features explored, such as individual activity level and personal characteristics, represent personalization, demonstrating the effectiveness of attribute-based relationship mining;
The Matrix Factorization (MF) method incorporates sign factors during matrix construction, and the Local Path (LP) index likewise considers sign factors. As a result, their experimental performance is relatively strong among the baseline models;
Our proposed method is based on similarity calculations under given meta-paths. It considers higher-order neighbor relationships when setting meta-paths, thereby achieving certain improvements. Furthermore, by incorporating node characteristics, the model further enhances its experimental effectiveness. The method outperforms the baseline models, which may be attributed to its consideration of both structural features and personal attribute features. Additionally, in the structural features, the model takes into account the user relationships manifested during interactions, showing that additional information is beneficial for relationship mining.

4.3.2. Signed Network Embedding Models (Node Classification Task)

Experiments on signed network embedding models were conducted using the Wiki and Slashdot datasets, with the experimental results in node classification tasks presented in Table 4. It is observed that HMSN demonstrates the best performance on both datasets. Among the baseline models, SDGNN performed the best, likely due to its consideration of status theory on top of SiGAT as an additional task, thereby enhancing its performance. The superior results achieved by the model presented in this paper may be attributed to the fact that other signed network embedding methods only considered first-order neighbors, while HMSN aggregated higher-order neighbors. This also indicates that higher-order neighbors have a significant impact on the target users.

4.3.3. Heterogeneous Signed Network Model (Link Prediction Task)

This section compares the experimental results of the heterogeneous signed network model on the Wiki dataset. As observed from Table 5, HMSN outperforms the other heterogeneous signed network models. Among them, SiHet shows lesser efficacy compared to other heterogeneous network models. An exploration into the principles of SiHet reveals that it overlooks the issue of node heterogeneity in its research process, which may be the reason for its suboptimal performance. Meanwhile, NESA, employing an encoder approach, appears to surpass baseline heterogeneous network embedding models overall.

Furthermore, when combining the experimental outcomes in both heterogeneous and signed networks, it can be concluded that the model proposed in this article is also applicable to both heterogeneous and signed networks.

4.3.4. Ablation Study

The HMSN model utilizes four user features: the Sentiment Reversal Feature $S$, the Social Relationship Feature $J$, the Individual Characteristic Feature $O_{u}$, and the Individual Activity Level Feature $A_{u}$. Each of these features is excluded in turn, denoted as $S$ w/o, $J$ w/o, $O_{u}$ w/o, and $A_{u}$ w/o, and the relationship mining model is implemented using the remaining three features. The experimental results are shown in Figure 7.

It can be observed that the HMSN model constructed with all four user features exhibits the best experimental performance. The comparison shows that all four features contribute positively to the overall model, and that sentiment reversal and social relationships have stronger representational capabilities than the other features.

The HMSN model uses two topic features: the meta-path similarity feature $H$ and the topic characteristic feature $O_{v}$. Each of these features is excluded in turn, denoted as $H$ w/o and $O_{v}$ w/o, respectively. The relationship mining model is then implemented using the remaining feature, and the experimental results are illustrated in Figure 8.

It can be observed that the HMSN model constructed with both topic features achieves the best experimental results. A comparative analysis further reveals that both features contribute significantly to the overall model performance. Additionally, the meta-path similarity feature $H$ exhibits stronger representation capabilities than the topic characteristic feature $O_{v}$.

5. Conclusions

This paper reframes the sentiment analysis problem by conceptualizing it as a link polarity prediction challenge. Through the capture of latent relationships between nodes, facilitated by easily obtainable network features, the study derives emotional embeddings for the nodes, thereby predicting emotional polarity indicative of sentiment between them. The investigation places emphasis on user behavior, delving deeply into the emotional attitudes conveyed through user interactions. Furthermore, it explores the various factors influencing attitudes between users and topics.

This paper addresses the sentiment analysis challenge from user and topic perspectives by constructing a heterogeneous signed network that facilitates learning emotional representations. A layered design graph embedding model is devised to acquire emotional embeddings, simulate emotion propagation, and draw inspiration from social homophily theory. The study employs loss optimization to refine emotional representations, exploiting the unique characteristics of signed networks. These final representations are then employed to predict user sentiments toward specific topics.

This paper reviews the current state of research and the fundamental approaches in sentiment analysis within social networks, analyzing the shortcomings of existing methods and suggesting directions for improvement. Based on these insights, it proposes two research directions: relationship prediction in social networks and sentiment prediction using network embedding.

HMSN tackles sentiment analysis from a novel perspective, transforming it into a problem of predicting signed sentiment links. In this context, we introduce a new relationship mining scheme as an enhancement. However, several areas remain open for exploration. The mining of structural features is relatively superficial, and there is potential to integrate higher-order neighbors for relationship mining based on global structural features. For methods predicting sentiment signs, it would be beneficial to consider alternative approaches beyond node similarity and conduct comparative experiments. Because our focus was on predicting the sign of sentiment links, we did not model whether a sentiment link exists at all, and categorized all node relationships strictly as positive or negative. This approach needs refinement: considering neutral or non-linked node relationships would more accurately reflect the range of emotional attitudes between entities in the real world.

Author Contributions

Conceptualization, Q.Z. and J.H.; methodology, Q.Z.; software, J.H.; validation, C.Y. and J.H.; formal analysis, C.Y.; investigation, J.H.; resources, J.L.; data curation, J.L.; writing—original draft preparation, C.Y.; writing—review and editing, Q.Z.; visualization, C.Y.; supervision, Q.Z.; project administration, J.L. and D.A.; funding acquisition, Q.Z. and D.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded in part by the National Key Research and Development Program of China under Grant No. 2022YFB4501704, in part by the National Natural Science Foundation of China under Grant No. 62302308, 62372300, 61702333 and U2142206, and in part by the Shanghai Sailing Program under Grant No. 21YF1432900.

Data Availability Statement

Data are contained within the article.

Acknowledgments

This work was supported by the Shanghai Engineering Research Center of Intelligent Education and Big Data, and also by the Research Base of Online Education for Shanghai Middle and Primary Schools.

Conflicts of Interest

The authors declare no conflict of interest.

References

Bie, Y.; Yang, Y. A multitask multiview neural network for end-to-end aspect-based sentiment analysis. Big Data Min. Anal. 2021, 4, 195–207. [Google Scholar] [CrossRef]
Groh, G.; Hauffa, J. Characterizing social relations via nlp-based sentiment analysis. In Proceedings of the International AAAI Conference on Web and Social Media, Barcelona, Spain, 17–21 July 2011; Volume 5, pp. 502–505. [Google Scholar]
Tubishat, M.; Idris, N.; Abushariah, M.A. Implicit aspect extraction in sentiment analysis: Review, taxonomy, oppportunities, and open challenges. Inf. Process. Manag. 2018, 54, 545–563. [Google Scholar] [CrossRef]
Fang, Z.; Zhang, Q.; Tang, X.; Wang, A.; Baron, C. An implicit opinion analysis model based on feature-based implicit opinion patterns. Artif. Intell. Rev. 2020, 53, 4547–4574. [Google Scholar] [CrossRef]
Chen, H.Y.; Chen, H.H. Implicit polarity and implicit aspect recognition in opinion mining. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Berlin, Germany, 7–12 August 2016; pp. 20–25. [Google Scholar]
Liao, J.; Wang, S.; Li, D. Identification of fact-implied implicit sentiment based on multi-level semantic fused representation. Knowl.-Based Syst. 2019, 165, 197–207. [Google Scholar] [CrossRef]
Huang, S.; Zhao, Q.; Xu, X.Z.; Zhang, B.; Wang, D. Emojis-based recurrent neural network for Chinese microblogs sentiment analysis. In Proceedings of the 2019 IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI), Zhengzhou, China, 6–8 November 2019; pp. 59–64. [Google Scholar]
Ouyang, X.; Zhou, P.; Li, C.H.; Liu, L. Sentiment analysis using convolutional neural network. In Proceedings of the 2015 IEEE International Conference on Computer and Information Technology, Ubiquitous Computing and Communications, Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing, Liverpool, UK, 26–28 October 2015; pp. 2359–2364. [Google Scholar]
Yuan, W.; He, K.; Han, G.; Guan, D.; Khattak, A.M. User behavior prediction via heterogeneous information preserving network embedding. Future Gener. Comput. Syst. 2019, 92, 52–58. [Google Scholar] [CrossRef]
Flesch, B.J. Social Interaction Model. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018; pp. 3656–3658. [Google Scholar]
Guerrero-Solé, F. Interactive Behavior in Political Discussions on Twitter: Politicians. Soc. Media Soc. 2018. [Google Scholar] [CrossRef]
Zou, X.; Yang, J.; Zhang, W.; Han, H. Collaborative community-specific microblog sentiment analysis via multi-task learning. Expert Syst. Appl. 2021, 169, 114322. [Google Scholar] [CrossRef]
Zhao, Q.; Liu, G.; Yang, F.; Yang, R.; Kou, Z.; Wang, D. Self-Supervised Signed Graph Attention Network for Social Recommendation. In Proceedings of the 2023 International Joint Conference on Neural Networks (IJCNN), Gold Coast, Australia, 18–23 June 2023; pp. 1–9. [Google Scholar]
Zhao, Q.; Huang, J.; Liu, G.; Miao, Y.; Wang, P. A Multiinterest and Social Interest-Field Framework for Financial Security. IEEE Trans. Comput. Soc. Syst. 2023, 1–11. [Google Scholar] [CrossRef]
Zhao, Q.; Yang, F.; An, D.; Lian, J. Modeling Structured Dependency Tree with Graph Convolutional Networks for Aspect-Level Sentiment Classification. Sensors 2024, 24, 418. [Google Scholar] [CrossRef]
Das, S.R.; Chen, M.Y. Yahoo! for Amazon: Sentiment parsing from small talk on the web. In For Amazon: Sentiment Parsing from Small Talk on the Web (August 5, 2001); EFA: Bowie, MD, USA, 2001. [Google Scholar]
Muhammad, A. Contextual Lexicon-Based Sentiment Analysis for Social Media. Ph.D. Thesis, Université Robert Gordon University, Aberdeen, UK, 2016. [Google Scholar]
Speriosu, M.; Sudan, N.; Upadhyay, S.; Baldridge, J. Twitter polarity classification with label propagation over lexical links and the follower graph. In Proceedings of the First Workshop on Unsupervised Learning in NLP, Edinburgh, UK, 30 July 2011; pp. 53–63. [Google Scholar]
Zhao, Q.; Wang, C.; Wang, P.; Zhou, M.; Jiang, C. A novel method on information recommendation via hybrid similarity. IEEE Trans. Syst. Man Cybern. Syst. 2016, 48, 448–459. [Google Scholar] [CrossRef]
Sapountzi, A.; Psannis, K.E. Social networking data analysis tools & challenges. Future Gener. Comput. Syst. 2018, 86, 893–913. [Google Scholar]
Zhao, Q.; Zhou, Z.; Li, J.; Jia, S.; Pan, J. Time-Dependent Prediction of Microblog Propagation Trends Based on Group Features. Electronics 2022, 11, 2585. [Google Scholar] [CrossRef]
Tan, C.; Lee, L.; Tang, J.; Jiang, L.; Zhou, M.; Li, P. User-level sentiment analysis incorporating social networks. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, 21–24 August 2011; pp. 1397–1405. [Google Scholar]
Kuo, Y.H.; Fu, M.H.; Tsai, W.H.; Lee, K.R.; Chen, L.Y. Integrated microblog sentiment analysis from users’ social interaction patterns and textual opinions. Appl. Intell. 2016, 44, 399–413. [Google Scholar] [CrossRef]
Ren, F.; Wu, Y. Predicting user-topic opinions in twitter with social and topical context. IEEE Trans. Affect. Comput. 2013, 4, 412–424. [Google Scholar] [CrossRef]
Kim, J.; Yoo, J.; Lim, H.; Qiu, H.; Kozareva, Z.; Galstyan, A. Sentiment prediction using collaborative filtering. In Proceedings of the International AAAI Conference on Web and Social Media, Cambridge, MA, USA, 8–11 July 2013; Volume 7, pp. 685–688. [Google Scholar]
Smith, L.M.; Zhu, L.; Lerman, K.; Kozareva, Z. The role of social media in the discussion of controversial topics. In Proceedings of the 2013 International Conference on Social Computing, Alexandria, VA, USA, 8–14 September 2013; pp. 236–243. [Google Scholar]
Eliacik, A.B.; Erdogan, N. Influential user weighted sentiment analysis on topic based microblogging community. Expert Syst. Appl. 2018, 92, 403–418. [Google Scholar] [CrossRef]
Nozza, D.; Maccagnola, D.; Guigue, V.; Messina, E.; Gallinari, P. A latent representation model for sentiment analysis in heterogeneous social networks. In Proceedings of the Software Engineering and Formal Methods: SEFM 2014 Collocated Workshops: HOFM, SAFOME, OpenCert, MoKMaSD, WS-FMDS, Grenoble, France, 1–2 September 2014; Revised Selected Papers 12; Springer: Berlin/Heidelberg, Germany, 2015; pp. 201–213. [Google Scholar]
Cheng, K.; Li, J.; Tang, J.; Liu, H. Unsupervised sentiment analysis with signed social networks. In Proceedings of the AAAI Conference on Artificial Intelligence, London, UK, 30–31 May 2017; Volume 31. [Google Scholar]
Beklaryan, A.; Akopov, A.S. Simulation of agent-rescuer behaviour in emergency based on modified fuzzy clustering. In Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems (AAMAS ’16), Singapore, 9–13 May 2016; pp. 1275–1276. [Google Scholar]
Wang, L.; Niu, J.; Yu, S. SentiDiff: Combining textual information and sentiment diffusion patterns for Twitter sentiment analysis. IEEE Trans. Knowl. Data Eng. 2019, 32, 2026–2039. [Google Scholar] [CrossRef]
Naskar, D.; Singh, S.R.; Kumar, D.; Nandi, S.; de la Rivaherrera, E.O. Emotion dynamics of public opinions on twitter. ACM Trans. Inf. Syst. (TOIS) 2020, 38, 1–24. [Google Scholar] [CrossRef]
Grover, A.; Leskovec, J. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 855–864. [Google Scholar]
Huiting, L.; Yan, C.; Huihui, X. Matrix factorization recommendation algorithm based on users’ preference. J. Comput. Appl. 2015, 2021, 6610645. [Google Scholar]
Cao, Q.; Shen, H.; Cen, K.; Ouyang, W.; Cheng, X. Deephawkes: Bridging the gap between prediction and understanding of information cascades. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, New York, NY, USA, 6–10 November 2017; pp. 1149–1158. [Google Scholar]
Lim, D.; Hohne, F.; Li, X.; Huang, S.L.; Gupta, V.; Bhalerao, O.; Lim, S.N. Large scale learning on non-homophilous graphs: New benchmarks and strong simple methods. Adv. Neural Inf. Process. Syst. 2021, 34, 20887–20902. [Google Scholar]
Leskovec, J.; Huttenlocher, D.; Kleinberg, J. Signed networks in social media. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Atlanta, GA, USA, 10–15 April 2010; pp. 1361–1370. [Google Scholar]

Figure 1. HMSN model architecture.

Figure 2. Diagram illustrating emotional reversal.

Figure 3. Heterogeneous signed information network.

Figure 4. The data distribution of emotional reversal and following relationships.

Figure 5. The distribution of emotional reversal data in user attention under different user interaction volumes, with Weibo data on the left and Twitter data on the right.

Figure 6. The correlation coefficient between emotional reversal and attention under different user interaction volumes, with Weibo data on the left and Twitter data on the right.

Figure 7. Results of the ablation experiments on the HMSN Model.

Figure 8. Results of the model ablation experiments.

Table 1. Retweet interactions in different datasets.

Number of interactions	1	2	3	4	5	6	7	8	>8
Weibo	14435	5775	2176	553	131	61	40	26	54
Twitter	27582	8832	3410	1519	718	368	227	128	316

Table 2. User relationship prediction on the Weibo dataset.

Dataset	Metric	Node2Vec	SDNE	MF	LP	HMSN
Weibo	AUC	0.7497	0.7592	0.7831	0.7774	0.8033
Weibo	Accuracy	0.7634	0.7801	0.8063	0.7983	0.8143

Table 3. Topic relationship prediction on the Wiki dataset and the Slashdot dataset.

Dataset	Metric	Node2Vec	SDNE	MF	LP	HMSN
Wiki	AUC	0.6834	0.6938	0.7438	0.7232	0.7661
Wiki	Accuracy	0.7193	0.7402	0.7864	0.7439	0.8004
Slashdot	AUC	0.6559	0.6675	0.6988	0.7037	0.7139
Slashdot	Accuracy	0.6941	0.7135	0.7392	0.7383	0.7480

Table 4. Experimental results of signed network embedding models on the Wiki and Slashdot datasets (Node Classification Task).

	Wiki			Slashdot
	Macro f1	Micro f1	AUC	Macro f1	Micro f1	AUC
SigNet	0.7002	0.8139	0.8198	0.7155	0.8009	0.8340
SiGAT	0.7223	0.8361	0.8537	0.7487	0.8437	0.8698
SDGNN	0.7512	0.8541	0.8656	0.7555	0.8502	0.8712
HMSN	0.7689	0.8612	0.8832	0.7801	0.8622	0.8817

Table 5. Experimental results on the Wiki dataset (Link Prediction Task).

Metrics	SiHet	NESA	HMSN
AUC	0.8223	0.8362	0.8398
Accuracy	0.8023	0.8102	0.8239

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zhao, Q.; Yu, C.; Huang, J.; Lian, J.; An, D. Sentiment Analysis Based on Heterogeneous Multi-Relation Signed Network. Mathematics 2024, 12, 331. https://doi.org/10.3390/math12020331

