1. Introduction
Many social relationships can be represented by signed networks, whose edges carry positive or negative signs. Positive edges in a signed network represent positive ties, such as friendship, trust, and support, whereas negative edges represent negative relationships, such as enmity, mistrust, and resistance [1]. In addition, there exist complicated systems in which positive and negative interactions are implicit. For example, a hyperlink on a website may convey acceptance or disapproval of the target page, depending on the semantics of both pages. It has been shown that such invisible negative links are predictable and that they convey significantly different feature information than positive links [2]. It is increasingly recognized that mining adversarial relationships in social systems has a wide range of applications in social network analysis, such as social sentiment analysis [3,4,5] and relationship recommendation [6].
Network embedding refers to learning a fixed-length, low-dimensional vector for each node. It is essential for a variety of social network analysis tasks, such as link prediction and node classification, and it has gained interest in data mining and social computing [7,8,9,10,11,12]. Recently, network embedding has been greatly improved by graph neural networks (GNNs) because of their tremendous end-to-end modeling capabilities. By combining graph theory and the convolution theorem, Bruna et al. [13] developed the first graph convolutional network (GCN). However, this model is hampered by high time complexity. To address this issue, ChebNet [14] and GCN parametrize the convolution kernel in the spectral domain to reduce time complexity. Although both are spectral approaches, they began to define node weight matrices spatially. Inspired by these approaches, spatial methods began to represent node weights using attention, sampling, and aggregation mechanisms.
However, traditional GCNs cannot be applied directly to signed networks [15,16], since signed networks follow specific principles derived from social theory, such as "the enemy of my enemy is my friend", which demonstrates that negative links can have a large impact on the quality of node embeddings. To accurately capture the role of negative links, considerable effort has been devoted to designing network embedding methods specifically for signed networks. For example, SiNE [17] (Signed Network Embedding) learns low-dimensional embeddings by exploiting structural balance theory in signed networks. SGCN [18] extends the GCN to signed networks and designs both positive and negative aggregators to generate node embeddings based on balance theory. SiGAT [19] integrates attention mechanisms into signed directed networks and builds a motif-based graph neural network model. However, few existing approaches take the hierarchical graph pooling of networks into consideration, even though it is critical for improving the representational power of a graph: hierarchical graph pooling can enhance representation learning with high-order structural information.
Motivated by these observations, we propose SHGP, a novel framework that learns Signed network embeddings with Hierarchical Graph Pooling. Instead of only applying aggregators to collect information from neighboring nodes, SHGP employs a pooling operator to learn a hierarchical pooling of real signed graphs and uses the learned graph-level representation as part of the node representation.
Our main contributions are threefold:
A graph pooling layer is introduced to learn effective signed network embeddings. The essence of this pooling mechanism is to select a subset of critical nodes, enabling the encoding of high-level features and the hierarchical pooling of signed networks.
An objective function that considers signed link prediction and structural balance theory is designed to optimize the framework and learn node representations.
Extensive experiments on three real-world signed network datasets show the effectiveness of the proposed SHGP framework through the signed link prediction task.
The remainder of this paper is organized as follows. Section 2 reviews signed network embedding research. Section 3 explains the notation and preliminaries of the research content. Section 4 outlines the SHGP framework. Section 5 presents the experimental analysis of the proposed method. Finally, we conclude the paper.
3. Problem Definition
To facilitate the presentation, we begin by introducing the primary notations and definitions used in this paper. Given a signed network $G = (V, E)$ constructed from a set of $N$ nodes, the set of positive links between nodes is denoted $E^+$ and the set of negative links $E^-$. Note that $E = E^+ \cup E^-$ and $E^+ \cap E^- = \varnothing$. The sets of positive and negative neighbors of a node $i$ are denoted $N_i^+$ and $N_i^-$, respectively.
Similarly, we define the sets of balanced (unbalanced) neighbor nodes: in a signed network, if nodes $i$ and $j$ are connected by an $L$-hop path, then $j$ belongs to the set of balanced (unbalanced) neighbors of $i$ if there is an even (odd) number of negative links along the path.
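The parity rule above can be made concrete with a few lines of Python. This is only an illustrative sketch (the function name and representation are ours, not part of SHGP): a path is balanced exactly when it carries an even number of negative links.

```python
def path_is_balanced(path_signs):
    """Return True iff the path is balanced, i.e. it carries an
    even number of negative links.

    path_signs: list of +1 / -1 edge signs along an L-hop path.
    """
    negatives = sum(1 for s in path_signs if s < 0)
    return negatives % 2 == 0

# "The enemy of my enemy is my friend": two negative hops -> balanced.
print(path_is_balanced([-1, -1]))  # True
print(path_is_balanced([+1, -1]))  # False
```

Here the sign product interpretation is equivalent: a path is balanced iff the product of its edge signs is positive.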
The two triangles on the left of Figure 1 are structurally balanced triangles. In contrast, in the unbalanced triangles on the right, the relationship between a pair of nodes is simultaneously friend and foe. Moreover, structural balance in signed networks allows nodes from different node sets to express different features in the embedding space: the information aggregated from a balanced node set conveys different characteristics than that from an unbalanced node set. To handle this, we use a balanced embedding and an unbalanced embedding to represent each node; that is, two different types of embedding are leveraged to aggregate features from the two node sets.
4. Proposed Signed Network Embedding with Hierarchical Graph Pooling
In this section, SHGP is proposed to learn signed network embeddings. Before formalizing SHGP, we explain its construction. First, traditional GCNs (which handle only positive links) generate a node representation by aggregating information from a node's local neighbors and then apply an aggregation function that combines the original feature to update the feature of the current node. However, signed networks carry signs on their edges, and negative edges convey different information than positive ones. Therefore, different aggregation functions are needed to aggregate features from different types of neighbors. Moreover, hierarchical graph pooling plays a crucial role in network embedding and has been shown to boost numerous network analysis tasks such as link prediction. As shown in Figure 2, two different aggregators (the red one is the friend relation aggregator, the green one the foe aggregator) are utilized to combine information from neighboring nodes. The top of the diagram depicts the hierarchical pooling aggregation process, where the white nodes indicate the higher-order nodes involved in the aggregation. The global vector $Z_p$ and the convolution vector $Z_c$ together form the final node embedding $Z$. In addition, a pooling operation aggregates hierarchical structure from the signed graph: it computes the significance coefficients of the neighbors (including high-order neighbors) and generates graph-level embeddings of the nodes. Finally, the signed link prediction results are obtained with a logistic regression classifier. The rest of this section is structured as follows: we first discuss how to sample and propagate information in a signed network; we then explain in depth how hierarchical graph pooling is used to weight node embeddings; and we conclude with how the model is trained and applied to practical tasks on signed networks.
4.1. Local Signed Graph Convolution
In general, most graph neural networks can be considered message-passing neural networks (MPNNs). The core of MPNNs is the definition of aggregation function and update function between nodes. To begin, the local structural expression of each node is obtained by applying the aggregation function to it and its neighboring nodes. Second, the current node’s representation is updated using the update function and the local structural representation. The general expression of the MPNNs can be expressed as follows:
$$m_i^{t+1} = M^t\big(\{(h_j^t, e_{ij}) : j \in N(i)\}\big), \qquad h_i^{t+1} = U^t\big(h_i^t, m_i^{t+1}\big)$$

where $h_i^t$ denotes the hidden-layer representation of node $i$ at step $t$, $e_{ij}$ is the feature of a given link, $M^t$ represents the aggregate function at step $t$, $m_i^{t+1}$ is the local structure representation of node $i$ after aggregation, and $U^t$ stands for the update function. By designing appropriate sampling and aggregation functions, such as weighted or mean aggregation, the target node accepts the features passed from its neighborhood nodes and completes an update of its own features through feature fusion of the local structure, obtaining a new feature representation.
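The two-stage aggregate-then-update scheme can be sketched in plain Python. This is a generic illustration of the MPNN template, not SHGP's implementation; `mpnn_step`, `mean_agg`, and `add_update` are names we introduce for the example, with mean aggregation and an additive update as simple concrete choices.

```python
def mpnn_step(h, edges, aggregate, update):
    """One message-passing step: aggregate neighbor states, then update.

    h:         dict node -> feature vector (list of floats)
    edges:     list of (src, dst) pairs; messages flow src -> dst
    aggregate: combines a list of neighbor vectors into one message m_i
    update:    combines the old state h_i with the message m_i
    """
    inbox = {i: [] for i in h}
    for src, dst in edges:
        inbox[dst].append(h[src])
    return {i: update(h[i], aggregate(msgs)) for i, msgs in inbox.items()}

def mean_agg(msgs):
    # Element-wise mean of the incoming vectors; None if no messages.
    if not msgs:
        return None
    return [sum(col) / len(msgs) for col in zip(*msgs)]

def add_update(h_i, m_i):
    # Keep the old state when no message arrived, otherwise add the message.
    return h_i if m_i is None else [a + b for a, b in zip(h_i, m_i)]

h = {0: [1.0], 1: [2.0], 2: [4.0]}
h_next = mpnn_step(h, [(1, 0), (2, 0)], mean_agg, add_update)
print(h_next[0])  # [4.0]  (1.0 + mean(2.0, 4.0))
```

Swapping in a weighted aggregation or a learned update function recovers the other MPNN variants mentioned above.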
A signed network is a specialized type of network that carries type information on its edges. It not only includes two types of edges (positive links and negative links) but also exhibits special sociological properties such as structural balance. In particular, the fact that my enemy's enemy is my friend (a foe of a foe two hops from the central node is a friend) makes it infeasible to define aggregation functions based on edge type alone. As a result, we use two different GNN aggregators in this paper to aggregate different information from $N_i^+$ and $N_i^-$. In the first aggregation layer, given the initial feature $x_i$ of node $i$, we generate the balanced embedding $h_i^{B(1)}$ and the unbalanced embedding $h_i^{U(1)}$:

$$h_i^{B(1)} = \sigma\big(W^{B(1)} \cdot \mathrm{AGG}\big(\{x_j : j \in N_i^+\} \cup \{x_i\}\big)\big), \qquad h_i^{U(1)} = \sigma\big(W^{U(1)} \cdot \mathrm{AGG}\big(\{x_j : j \in N_i^-\}\big)\big)$$

where $\sigma(\cdot)$ is the nonlinear activation function, $\mathrm{AGG}$ is the aggregate operation for aggregating feature information from node pairs, and $W^{B(1)}$, $W^{U(1)}$ refer to the linear transformation matrices responsible for the information aggregated from $N_i^+$ and $N_i^-$, respectively. Since the first layer can only reach first-order neighbors, structural balance does not yet come into play, and friends or foes can be obtained by direct aggregation. From the second layer onward, however, the friend representation of node $i$ is acquired by the aggregator from its own friends, its friends' friends, and its enemies' enemies, based on balance theory.
For the deeper aggregation layers ($l > 1$), the embeddings are defined recursively:

$$h_i^{B(l)} = \sigma\big(W^{B(l)} \cdot \mathrm{AGG}\big(\{h_j^{B(l-1)} : j \in N_i^+\} \cup \{h_k^{U(l-1)} : k \in N_i^-\} \cup \{h_i^{B(l-1)}\}\big)\big)$$
$$h_i^{U(l)} = \sigma\big(W^{U(l)} \cdot \mathrm{AGG}\big(\{h_j^{U(l-1)} : j \in N_i^+\} \cup \{h_k^{B(l-1)} : k \in N_i^-\} \cup \{h_i^{U(l-1)}\}\big)\big)$$

where $W^{B(l)}$ and $W^{U(l)}$ are the shared weight matrices. When the number of aggregation layers exceeds two, the balanced embedding of node $i$ aggregates information not only from the balanced node set but also from nodes in the unbalanced node set whose relationship is that of an enemy's enemy.
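The recursion above can be illustrated with a stripped-down sketch: balanced states gather the balanced states of friends and the unbalanced states of foes, and vice versa. This is a minimal toy (scalar features, mean aggregation, no weight matrices or activation), with the function name `signed_layer` chosen by us for the example.

```python
def signed_layer(hB, hU, pos_nbrs, neg_nbrs):
    """One balanced/unbalanced aggregation step (l > 1), scalar features.

    hB, hU:   dicts node -> balanced / unbalanced embedding (floats here)
    pos_nbrs: dict node -> list of positive (friend) neighbors
    neg_nbrs: dict node -> list of negative (foe) neighbors
    """
    newB, newU = {}, {}
    for i in hB:
        # Balanced side: friends' balanced states + foes' unbalanced states.
        b_msgs = [hB[j] for j in pos_nbrs[i]] + [hU[k] for k in neg_nbrs[i]] + [hB[i]]
        # Unbalanced side: friends' unbalanced states + foes' balanced states.
        u_msgs = [hU[j] for j in pos_nbrs[i]] + [hB[k] for k in neg_nbrs[i]] + [hU[i]]
        newB[i] = sum(b_msgs) / len(b_msgs)
        newU[i] = sum(u_msgs) / len(u_msgs)
    return newB, newU

# Two mutual foes: node 1's unbalanced state feeds node 0's balanced state,
# implementing "the enemy of my enemy is my friend" across layers.
hB = {0: 1.0, 1: 0.0}
hU = {0: 0.0, 1: 1.0}
newB, newU = signed_layer(hB, hU, {0: [], 1: []}, {0: [1], 1: [0]})
print(newB[0])  # 1.0
```

In the actual model each message would additionally pass through the layer's weight matrix and nonlinearity.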
4.2. Global Signed Graph Aggregation
Hierarchical graph pooling improves network embedding through graph pooling techniques. Convolutional neural networks for Euclidean data use pooling layers to reduce the size of the feature map and broaden the receptive field, improving feature generalization and extraction. Analogously, graph pooling layers have been proposed for generating graph-level representations. Early graph pooling layers were typically flat: they built graph-level representations directly from node-level representations in a single operation, for example by applying average or maximum pooling to each feature channel. Hierarchical graph pooling was developed later, with the goal of capturing the information contained in a graph by gradually coarsening the original graph. Inspired by this line of work, we design a hierarchical graph pooling layer for signed networks, which learns the importance of different nodes in the two node sets (the balanced set and the unbalanced set). When aggregating and propagating information, the operation selects a subset of nodes to form a new, smaller graph. As shown in Figure 3, given an input graph with four nodes, each with three features, we obtain the input feature matrix $X \in \mathbb{R}^{4 \times 3}$. Applying the trainable projection vector $p$ maps the node features to one dimension. The top-$k$ nodes with the highest scores are then selected with the help of a sigmoid function, and their index information is recorded. Finally, the index selection preserves the positional order of the original graph, and the feature matrix $X'$ and adjacency matrix $A'$ of the new graph are obtained through the selected indices.
As previously stated, node embeddings from the local graph convolution are of two types, the balanced embedding $h^B$ and the unbalanced embedding $h^U$, which have different characteristics. Accordingly, we perform separate pooling operations to extract the hierarchical structure from each of the two node sets. In the pooling layer, an importance measure of the nodes is learned from the input node features $X$:

$$y = \frac{X p}{\lVert p \rVert}$$

where $X$ is the input matrix of node features and $p$ is the learnable projection vector. After obtaining the importance scores $y$, we sort all of the nodes and select the top $K$ most important ones:

$$\mathrm{idx} = \mathrm{top}\text{-}K(y)$$

Once the indices of the selected nodes are obtained, the graph structure and node features of the pooled graph are constructed from the input graph as follows:

$$X' = X(\mathrm{idx}, :) \odot \big(\sigma(y(\mathrm{idx}))\,\mathbf{1}^\top\big), \qquad A' = A(\mathrm{idx}, \mathrm{idx})$$

where $\sigma(\cdot)$ denotes the sigmoid function, which maps importance scores to $(0, 1)$, and $\mathbf{1}$ is a vector with all elements equal to 1.
It is important to note that we use the same type of nodes as input to the pooling mechanism, and we employ a common matrix to relate the balanced and unbalanced embedded representations. This is because balanced and unbalanced embeddings have distinctly different physical meanings: the two node sets imply antagonistic relationships such as "trust" versus "distrust" [26]. As a result, using the same type of embedding to estimate the importance ranking provides a more appropriate estimate of the association between a pair of nodes.
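The pooling steps above can be sketched end to end in a few lines. This is an illustrative plain-Python version of the score/select/gate computation, under the notation reconstructed above; the names `topk_pool` and `sigmoid` are ours, and the real layer would operate on tensors.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def topk_pool(X, A, p, k):
    """Select the k highest-scoring nodes and gate their features.

    X: node feature matrix (list of rows), A: adjacency (list of lists),
    p: projection vector, k: number of nodes to keep.
    Scores y = X.p / |p|; the pooled features are the selected rows
    scaled by sigmoid(y), and A is sliced on the selected indices.
    """
    norm = math.sqrt(sum(v * v for v in p))
    y = [sum(x * v for x, v in zip(row, p)) / norm for row in X]
    idx = sorted(range(len(X)), key=lambda i: y[i], reverse=True)[:k]
    idx.sort()  # preserve the original node order, as in the paper
    X_new = [[x * sigmoid(y[i]) for x in X[i]] for i in idx]
    A_new = [[A[i][j] for j in idx] for i in idx]
    return X_new, A_new, idx

# Four nodes with three features each, as in the Figure 3 example.
X = [[1, 0, 0], [0, 2, 0], [0, 0, 3], [4, 0, 0]]
A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
X_new, A_new, idx = topk_pool(X, A, p=[1.0, 1.0, 1.0], k=2)
print(idx)  # [2, 3] -- the two highest-scoring nodes
```

The sigmoid gating keeps the projection vector $p$ trainable: gradients flow to $p$ through the scaling of the surviving rows.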
4.3. The Objective Function
This section describes the objective function and the training details of SHGP. Positive links, negative links, and no links are represented as $S = \{+, -, ?\}$, where "no link" means that there is no edge between nodes $i$ and $j$. In the hidden space, we aim to reduce the distance between positively linked node pairs as much as possible while increasing the distance between negatively linked pairs. The optimization problem is thus transformed into a three-way classification problem. We use a node mini-batch training approach to build a collection of edge triples $\mathcal{M}$, which consists of triplets of the form $(i, j, s)$, where $s \in S$ indicates which type of link exists between $i$ and $j$. We one-hot encode the edge types as $y_{ij} \in \{0, 1\}^{|S|}$, and the cross-entropy error over $\mathcal{M}$ can then be defined as:

$$\mathcal{L}_{CE} = -\frac{1}{|\mathcal{M}|} \sum_{(i,j,s) \in \mathcal{M}} \sum_{k \in S} y_{ij}[k] \log \mathrm{softmax}_k\big(\theta^\top z_{ij}\big)$$

where $\theta$ denotes the parameters of the softmax regression classifier and $z_{ij}$ is the representation of the node pair $(i, j)$.
We assign a distinct weight to each link type $s \in S$, depending on the amounts of positive links, negative links, and generated "no link" pairs in the signed network, as stated in [16]. This is done because signed networks are sparse and contain an imbalance of positive and negative links.
According to extended structural balance theory, nodes with positive links should be close, nodes with negative links should be far apart, and node pairs without links should lie in between. The triad-based objective function can be mathematically defined as:

$$\mathcal{L}_{tri} = \sum_{(i,j) \in E^+} \sum_{(i,k) \in E^-} \max\big(0,\ \lVert z_i - z_j \rVert_2^2 - \lVert z_i - z_k \rVert_2^2\big)$$

where $E^+$ and $E^-$ are the sets of positively and negatively linked node pairs from $\mathcal{M}$.
Based on the objectives of edge sign classification and structural balance theory, the overall objective function can be defined as:

$$\mathcal{L} = \mathcal{L}_{CE} + \alpha \mathcal{L}_{tri} + \lambda \mathcal{R}(\theta)$$

where $\alpha$ denotes the weight of the different loss functions and $\mathcal{R}(\theta)$ denotes the regularizer of the proposed framework.
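The three terms above can be sketched as plain functions. This is a toy illustration under the reconstructed notation (the function names and the choice of a squared-L2 regularizer in the usage line are ours, not the paper's): a softmax cross-entropy per edge triple, a triplet-style hinge that pushes a positive neighbor closer than a negative one, and their weighted sum.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for one edge triple over S = {+, -, ?}.

    logits: raw scores, one per link type; label: index of the true type.
    Uses the log-sum-exp trick for numerical stability.
    """
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]

def balance_hinge(z_i, z_j, z_k):
    """Hinge term: positive neighbor j should be closer to i than foe k."""
    d2 = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return max(0.0, d2(z_i, z_j) - d2(z_i, z_k))

def total_loss(ce_terms, tri_terms, reg, alpha, lam):
    """Overall objective: L = L_CE + alpha * L_tri + lam * R(theta)."""
    return sum(ce_terms) + alpha * sum(tri_terms) + lam * reg

# A triple already satisfying balance contributes no hinge loss:
print(balance_hinge([0.0, 0.0], [0.0, 1.0], [0.0, 3.0]))  # 0.0
```

In training, the cross-entropy terms come from the mini-batch of edge triples and the hinge terms from sampled (friend, foe) pairs around each anchor node.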
5. Experiment
In this section, the effectiveness of SHGP is evaluated in several stages. The experimental setup is presented first. Next, we report a sensitivity analysis of the SHGP parameters. Finally, the quality of the node embeddings produced by SHGP is compared against the baseline approaches.
5.1. Simulation Setup
SHGP is analyzed on Bitcoin-Alpha, Bitcoin-OTC, and Wikirfa [4]. Bitcoin-Alpha and Bitcoin-OTC are trust networks of platforms for trading Bitcoin. Because users must maintain a good reputation to avoid fraudulent and hazardous transactions, members can rate each other from −10 (total distrust) to +10 (full trust) in 1-point increments; we treat scores below 0 as negative links and scores above 0 as positive links. Wikirfa is a dataset of Wikipedia admin election data: for an editor to be promoted to administrator, they must make a request for adminship (RfA), and other Wikipedia members may vote in favor of, neutrally on, or in opposition to the request. A complete description of the datasets is given in Table 1.
In the trials, two unsigned and three signed network embedding methods were compared with the proposed method to illustrate its superiority. For the unsigned network methods, we removed the negative links during training, since these methods cannot distinguish between positive and negative links:

DeepWalk [18]: This technique simulates text generation by supplying successions of nodes obtained from random walk paths on the network;

Node2vec [25]: This approach modifies DeepWalk's random walk sequence, providing breadth-first and depth-first search through two parameters, p and q;

SiNE [15]: It uses a multilayer neural network to learn node representations based on triangle relations, extracting similarities and dissimilarities between nodes;

SIDE [19]: It generates node sequences by random walks and uses indirect signed connections to encode structural information into the learned node embeddings;

SGCN [16]: It uses balance theory to generate node embeddings, designing two node aggregators to aggregate and propagate information through graph convolution layers.
For a fair comparison, the final embedding dimension of all approaches is set to 64. For SiNE, SIDE, and SGCN, we use the hyperparameters and settings proposed in the respective articles. After obtaining the embedding of each node, we concatenate the embeddings of the two endpoint nodes and feed the concatenated representation into a logistic regression classifier to obtain the final signed link prediction. Our models are implemented in PyTorch 1.9.1 with the Adam optimizer and a learning rate of 0.0001. For each of the three real datasets, we use 80% of the data as the training set and 20% as the validation set.
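The prediction pipeline described above (concatenate the two endpoint embeddings, then score with logistic regression) can be sketched as follows. This is an illustrative snippet with hypothetical names (`edge_features`, `predict_sign`) and hand-picked weights; in the experiments the classifier weights are of course fitted on the training split.

```python
import math

def edge_features(z_u, z_v):
    """Concatenate the two node embeddings to form the link feature."""
    return z_u + z_v

def predict_sign(w, b, feat):
    """Logistic regression scorer: probability that the link is positive."""
    s = b + sum(wi * xi for wi, xi in zip(w, feat))
    return 1.0 / (1.0 + math.exp(-s))

z_u, z_v = [0.2, -0.1], [0.4, 0.3]
feat = edge_features(z_u, z_v)  # 4-dimensional link feature
prob = predict_sign([1.0, -1.0, 0.5, 0.5], 0.0, feat)
print(prob > 0.5)  # True -> the link is predicted positive
```

With 64-dimensional node embeddings, each link feature is 128-dimensional, matching the setup stated above.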
5.2. Parameter Analysis
This part analyzes the hyperparameters of the experiment, including the relationship between the number of epochs and AUC/F1, and the relationship between the number of aggregation layers and AUC. We performed experiments on all three datasets; because of space limitations, we only discuss the parameter behavior on the Bitcoin-OTC dataset. From Figure 4 and Figure 5, we can see that the model oscillates in the early stages of training due to random parameter initialization; once the epoch count exceeds 25, training stabilizes and converges quickly, and after 100 epochs the predictions are stable and no longer change significantly.
For the number of aggregation layers, we vary l from 1 to 5. As can be seen in Figure 5, the AUC gradually increases as the number of layers grows from 1 to 3 and begins to decrease at 4. This indicates that aggregating too deeply significantly degrades performance, since the influence of distant higher-order neighbors on a node is relatively small. We therefore set L = 3. We also set k = 3, since there is a GCN layer before each pooling layer to aggregate information from first-order neighboring nodes.
5.3. Comparison with State-of-the-Art
Signed link prediction, a downstream task of node embedding, is used to gauge the quality of the learned vectors. The task is to predict the sign of an edge: we represent each link's features using the embeddings of its two endpoint nodes, so signed link prediction reduces to classifying links as positive or negative. The performance of the binary classifier is assessed with the Area Under the Curve (AUC) and F1-score metrics. Because the real datasets lack node feature information, we use the embeddings obtained by the TVSD method as the initial node features. The AUC and F1-score results of the node embedding methods on the three datasets are shown in Table 2 and Table 3, respectively. We can see that:
For the unsigned network approaches, even though they consider only positive links, the metrics still improve, indicating that network structure plays a crucial role; Node2vec achieves the best results in this class of methods. After taking negative links into account, SiNE, SIDE, and SGCN significantly improve the prediction results over the unsigned methods. These methods combine sociological theory with network embedding and have achieved good results in signed network analysis. SGCN performs best among them, indicating that graph convolutional networks have a powerful capacity for feature aggregation.
SHGP achieves a significant performance improvement over these baseline methods on all network datasets, outperforming them in terms of both AUC and F1-score. Adding the graph pooling layer improves performance over SGCN by 5.3%, 3.7%, and 1.6% on the three real-world datasets. This highlights the necessity and efficacy of applying graph pooling procedures together with balance theory to achieve the desired results.