Article

Hypernetwork Representation Learning with Common Constraints of the Set and Translation

1 Department of Computer Technology and Application, Qinghai University, Xining 810000, China
2 State Key Laboratory of Tibetan Intelligent Information Processing and Application, Qinghai Normal University, Xining 810000, China
* Author to whom correspondence should be addressed.
Symmetry 2022, 14(8), 1745; https://doi.org/10.3390/sym14081745
Submission received: 7 July 2022 / Revised: 16 August 2022 / Accepted: 18 August 2022 / Published: 22 August 2022

Abstract: Unlike conventional networks, which contain only pairwise relationships among the nodes, hypernetworks also contain complex tuple relationships among the nodes, namely hyperedges. However, most existing network representation learning methods cannot effectively capture these tuple relationships. To resolve this challenge, this paper proposes a hypernetwork representation learning method with common constraints of the set and translation, abbreviated as HRST, which incorporates both the hyperedge set associated with the nodes and the hyperedge itself, regarded as an interaction relation among the nodes via the translation mechanism, into the process of hypernetwork representation learning, so as to obtain node representation vectors rich in both the hypernetwork topology structure and hyperedge information. Experimental results on four hypernetwork datasets demonstrate that, for the node classification task, our method outperforms the best of the other baseline methods by about 1%. For the link prediction task, our method is superior to almost all baseline methods.

1. Introduction

The goal of network representation learning, also known as network embedding, is to map each node to a vector in a low-dimensional representation space. The node representation vectors can then be applied to popular network analysis tasks, such as node classification [1], link prediction [2], and community detection [3].
According to the type of network, network representation learning is divided into conventional network representation learning and hypernetwork representation learning. As for conventional network representation learning, most related studies take only the network topology structure as input to learn node representation vectors, e.g., DeepWalk [4], node2vec [5], LINE [6], GraRep [7], and HOPE [8]. Nevertheless, node representation vectors learnt from the network topology structure alone are of limited quality. Hence, some researchers have proposed methods such as CANE [9] and CNRL [10], which incorporate other types of supplementary information, such as text, labels, and communities, into the process of network representation learning.
Nevertheless, the above network representation learning methods are designed for conventional networks with pairwise relationships.
As for the hypernetwork, hypernetwork representation learning [11] has gradually attracted wide attention from researchers. According to their characteristics, hypernetwork representation learning methods are divided into expanded spectral analysis methods and non-expanded methods. The expanded spectral analysis methods, such as the star and clique expansions [12], transform the hypernetwork into a conventional network to learn node representation vectors, losing hyperedge information during the expansion. The non-expanded methods, which do not decompose the hyperedges, are mainly divided into non-expanded spectral analysis methods and neural-network-based methods, such as Hyper2vec [13], HPHG [14], and DHNE [15].
Although the expanded spectral analysis methods are intuitive, they lose hyperedge information. The non-expanded methods do not decompose the hyperedge, but each has its own limitations. For example, Hyper2vec captures the pairwise relationships among the nodes in hyperedge-based walk sequences but does not capture the tuple relationships among the nodes well. HPHG effectively captures tuple relationships among the nodes with a one-dimensional convolutional layer, and DHNE does so with a multi-layer perceptron, but both are limited to heterogeneous hyperedges of a fixed size. Hence, the above methods cannot effectively capture complex tuple relationships of variable size. To resolve this challenge, this paper proposes a hypernetwork representation learning method with common constraints of the set and translation to effectively capture the tuple relationships among the nodes.
The following two points are the main characteristics of this paper:
  • The hypernetwork was transformed into a conventional network abstracted as a two-section graph. Based on this conventional network, a hypernetwork representation learning method with common constraints of the set and translation was proposed to learn node representation vectors rich in both the hypernetwork topology structure and hyperedge information.
  • The strength of our proposed method is that it incorporates hyperedges (tuple relationships) of arbitrary size into the process of hypernetwork representation learning. Its weakness is that some hypernetwork structure information is still lost, because the hypernetwork is transformed into a conventional network.

2. Related Studies

Unlike the conventional network, which has only pairwise relationships among the nodes, the hypernetwork also has complex tuple relationships among the nodes, namely hyperedges. However, most existing network representation learning methods cannot effectively capture these tuple relationships. To resolve this challenge, researchers have proposed hypernetwork representation learning methods, which can be divided into expanded spectral analysis methods and non-expanded methods. The expanded spectral analysis methods transform the hypernetwork into a conventional network, thereby reducing hypernetwork representation learning to conventional network representation learning, which is then solved via the spectral characteristics of the Laplacian matrix; the star and clique expansions are two classical examples. The non-expanded methods are mainly divided into non-expanded spectral analysis methods and neural-network-based methods. The non-expanded spectral analysis methods model the hypernetwork directly, that is, the Laplacian matrix is built on the hypernetwork itself, which preserves the integrity of hypernetwork information. For example, Zhou [16] extended the powerful spectral clustering method [17], originally formulated for undirected graphs, to the hypergraph [18] and further developed a hypergraph learning algorithm on the basis of spectral hypergraph clustering. Hyper2vec applies a biased random walk strategy on the hypergraph to preserve the structure and inherent properties of the hypernetwork. Neural-network-based methods have strong learning ability, flexible structure design, and high generalization, which make up for the defects of the spectral analysis methods. For example, DHNE theoretically proved that the linear similarity measures in the embedding space used by existing methods cannot preserve the indecomposability of the hypernetwork, and proposed a new deep model that realizes a nonlinear tuple similarity function preserving both local and global proximity in the embedding space. HPHG designs a hypergraph-based random walk to retain the hypernetwork topology structure when learning node representation vectors. Hyper-SAGNN [19] uses a self-attention mechanism [20] to aggregate hypergraph information, constructing pairwise attention coefficients between the nodes as dynamic node features and combining them with the original static node features to describe the nodes.

3. Problem Definition

Given the hypernetwork $H = (V, E)$, abstracted as a hypergraph and composed of the node set $V = \{v_i\}_{i=1}^{|V|}$ and the hyperedge set $E = \{e_i = (v_1, v_2, \ldots, v_m)\}_{i=1}^{|E|}$ $(m \ge 2)$, the goal of hypernetwork representation learning with common constraints of the set and translation was to learn a low-dimensional vector $\mathbf{r}_n \in \mathbb{R}^k$ for each node $n$ in the hypernetwork, where $k$ was expected to be much smaller than $|V|$.
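To make this setting concrete, the following is a minimal Python sketch of the hypergraph abstraction: a node set plus a list of node tuples. The class name and tuple encoding are illustrative assumptions, not code from this paper.

```python
# A minimal sketch of the hypernetwork H = (V, E): nodes are hashable
# identifiers and each hyperedge is a tuple of m >= 2 nodes.
from typing import Hashable, List, Set, Tuple

Node = Hashable
Hyperedge = Tuple[Node, ...]  # a hyperedge joins m >= 2 nodes

class Hypernetwork:
    def __init__(self, hyperedges: List[Hyperedge]):
        self.E: List[Hyperedge] = [e for e in hyperedges if len(e) >= 2]
        self.V: Set[Node] = {v for e in self.E for v in e}

# Example: three <user, location, activity> tuples, as in the GPS dataset.
H = Hypernetwork([("u1", "l1", "a1"), ("u1", "l2", "a2"), ("u2", "l1", "a1")])
print(len(H.V), len(H.E))  # 6 3
```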

4. Preliminaries

4.1. Transforming Hypergraph into Two-Section Graph

A feasible way to study the hypergraph was to transform it into a conventional graph, because research on conventional graphs is relatively mature. In the literature [18], hypergraphs are transformed into three kinds of conventional graphs, namely line, incidence, and two-section graphs. Two-section graphs lose less hypernetwork structure information than line and incidence graphs. Hence, the hypergraph was transformed into a two-section graph in this study. A hypergraph and its corresponding two-section graph are shown in Figure 1.
The two-section graph $S = (V', E')$ transformed from the hypergraph $H = (V, E)$ was a conventional graph satisfying the following conditions:
  • $V' = V$, that is, the node set of the two-section graph $S$ was equal to the node set of the hypergraph $H$.
  • One edge was associated with any two distinct nodes if and only if the two nodes were simultaneously associated with at least one hyperedge.
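Under this definition, the transformation simply connects every pair of distinct nodes that co-occur in a hyperedge. A small sketch of that reading follows; the function name and edge representation are assumptions.

```python
# A sketch of the two-section transformation: every pair of distinct nodes
# that co-occurs in at least one hyperedge becomes an undirected edge.
from itertools import combinations

def two_section_graph(hyperedges):
    """Return the edge set E' of the two-section graph; V' equals V."""
    edges = set()
    for e in hyperedges:
        for u, v in combinations(set(e), 2):  # distinct node pairs only
            edges.add(frozenset((u, v)))      # undirected, no duplicates
    return edges

# Two hyperedges sharing node u1 yield a clique on each hyperedge's nodes.
print(sorted(tuple(sorted(e)) for e in two_section_graph(
    [("u1", "l1", "a1"), ("u1", "l2", "a2")])))
```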

4.2. TransE

Knowledge representation learning vectorizes the entities and relations in a knowledge graph, specifically mapping each entity or relation to a low-dimensional vector space. For simplicity, $(h, r, t)$ denotes a triplet of head entity $h$, relation $r$, and tail entity $t$, whose corresponding vectors are $\mathbf{h}$, $\mathbf{r}$, and $\mathbf{t}$, respectively. In the relation extraction of the knowledge graph, TransE [21], a translation-based knowledge representation learning algorithm, assumes that the head entity vector plus the relation vector is approximately equal to the tail entity vector, that is, $\mathbf{h} + \mathbf{r} \approx \mathbf{t}$ when the triplet $(h, r, t)$ holds ($\mathbf{t}$ should be the nearest neighbor of $\mathbf{h} + \mathbf{r}$), while $\mathbf{h} + \mathbf{r}$ should otherwise be far away from $\mathbf{t}$. TransE is illustrated in Figure 2.
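As a quick illustration of this translation principle, the following sketch scores a triplet by the distance between $\mathbf{h} + \mathbf{r}$ and $\mathbf{t}$; the L2 distance is an assumption, and this is a sketch of the scoring idea, not the TransE training procedure.

```python
import numpy as np

def transe_score(h: np.ndarray, r: np.ndarray, t: np.ndarray) -> float:
    """Plausibility of a triplet: small when h + r is close to t."""
    return float(np.linalg.norm(h + r - t))

rng = np.random.default_rng(0)
h, r = rng.normal(size=8), rng.normal(size=8)
t_good = h + r + 0.01 * rng.normal(size=8)  # near-translation of h by r
t_bad = rng.normal(size=8)                  # unrelated tail vector
print(transe_score(h, r, t_good) < transe_score(h, r, t_bad))  # True for this seed
```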

5. Our Method

Hypernetwork representation learning with common constraints of the set and translation (HRST) is introduced in detail in this section. Firstly, the topology-derived model is introduced in Section 5.1. Secondly, the set constraint model is introduced in Section 5.2. Thirdly, the translation constraint model is introduced in Section 5.3. Fourthly, the joint optimization of the above three models is introduced in Section 5.4. Finally, the complexity analysis of HRST is given in Section 5.5.

5.1. Topology-Derived Model

Because CBOW [22] is computationally more efficient than skip-gram [22], a topology-derived model [11] based on negative sampling was introduced to capture the network structure. To be specific, in the optimization procedure of this model, the center node $n$ was the positive sample, the other nodes were negative samples, and $NEG(n)$ was the subset of negative samples with a predefined size $d_s$. For any $u \in V$, the node labels are denoted as follows:
$$L^n(u) = \begin{cases} 1, & u \in \{n\} \\ 0, & u \in NEG(n) \end{cases} \quad (1)$$
The prediction probability of the node $u$ given the contextual nodes $context(n)$ of $n$ is denoted as $p(u \mid context(n))$, and the node sequence set is denoted as $C$. In view of the above conditions, we maximized the following objective function:
$$D_1 = \prod_{n \in C} \prod_{u \in \{n\} \cup NEG(n)} p(u \mid context(n)) \quad (2)$$
When the node $n$ is regarded as a contextual node, the embedding vector $\mathbf{v}_n$ is its representation, while the parameter vector $\boldsymbol{\theta}_n$ is its representation when $n$ is regarded as the center node. $p(u \mid context(n))$ in Formula (2) is denoted as follows:
$$p(u \mid context(n)) = \begin{cases} \sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u), & L^n(u) = 1 \\ 1 - \sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u), & L^n(u) = 0 \end{cases} \quad (3)$$
where $\sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u) = 1/(1 + e^{-\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u})$ is the sigmoid function and $\mathbf{X}_n$ is the sum of the representation vectors of all the nodes in $context(n)$. Formula (3) can also be written as a single expression:
$$p(u \mid context(n)) = [\sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)]^{L^n(u)} [1 - \sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)]^{1 - L^n(u)} \quad (4)$$
Consequently, Formula (2) can be rewritten as follows:
$$D_1 = \prod_{n \in C} \prod_{u \in \{n\} \cup NEG(n)} [\sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)]^{L^n(u)} [1 - \sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)]^{1 - L^n(u)} \quad (5)$$
Formally, by maximizing $D_1$, the network topology was encoded into the node representation vectors.
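For concreteness, a sketch of one center node's contribution to $\log D_1$, mirroring Formulas (1)–(5); the dictionary-based parameter storage and function names are assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_d1_term(v, theta, n, context, neg_samples):
    """Contribution of one center node n to log D1.
    v, theta: dicts mapping node -> embedding / parameter vector."""
    X_n = np.sum([v[c] for c in context], axis=0)  # X_n: sum over context(n)
    term = np.log(sigmoid(X_n @ theta[n]))         # positive sample, L^n(n) = 1
    for u in neg_samples:                          # u in NEG(n), L^n(u) = 0
        term += np.log(1.0 - sigmoid(X_n @ theta[u]))
    return term
```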

5.2. Set Constraint Model

Because the above topology-derived model only considered the network structure, a set constraint model [11] based on negative sampling was introduced to consider both the network structure and the hyperedges. To be specific, in the optimization procedure of this model, $T_n$ was the set of the hyperedges associated with the center node $n$, which can also be viewed as a set of nodes if each hyperedge is regarded as a node. The center node $n$ was the positive sample, and the other nodes in $V$ not associated with $n$ were negative samples. For any $v \in T_n$, $NEG(v)$ was the subset of negative samples with a predefined size $d_s$, and the node labels are denoted as follows:
$$\delta(\vartheta \mid v) = \begin{cases} 1, & \vartheta \in \{n\} \\ 0, & \vartheta \in NEG(v) \end{cases} \quad (6)$$
In view of the node sequences $C$ and the hyperedge sets, we tried to maximize the following objective function to meet the set constraint:
$$D_2 = \prod_{n \in C} \prod_{v \in T_n} p(n \mid v) = \prod_{n \in C} \prod_{v \in T_n} \prod_{\vartheta \in \{n\} \cup NEG(v)} [\sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta)]^{\delta(\vartheta \mid v)} [1 - \sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta)]^{1 - \delta(\vartheta \mid v)} = \prod_{n \in C} \prod_{v \in T_n} \sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_n) \prod_{\vartheta \in NEG(v)} [1 - \sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta)] \quad (7)$$
where $\mathbf{e}_v$ is the parameter vector corresponding to $v \in T_n$.
By maximizing $D_2$, the hyperedges were encoded into the node representation vectors.
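A matching sketch of one center node's contribution to $\log D_2$ (Formulas (6) and (7)); keying each hyperedge by an identifier is a bookkeeping assumption.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_d2_term(e, theta, n, T_n, neg_of):
    """Contribution of one center node n to log D2.
    e: dict hyperedge -> parameter vector e_v; theta: dict node -> vector;
    neg_of: dict hyperedge -> list of negative-sample nodes NEG(v)."""
    term = 0.0
    for hv in T_n:
        term += np.log(sigmoid(e[hv] @ theta[n]))              # delta = 1, theta = n
        for neg in neg_of[hv]:
            term += np.log(1.0 - sigmoid(e[hv] @ theta[neg]))  # delta = 0
    return term
```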

5.3. Translation Constraint Model

Because the above set constraint model did not fully consider the hyperedges, it could not learn node representation vectors very well. Hence, we tried to incorporate the hyperedges associated with the nodes, regarded as the interaction relationships among the nodes, into the process of hypernetwork representation learning.
Inspired by the successful application of the translation mechanism in TransE, the nodes and interaction relationships were mapped into a unified representation space, where the interaction relationships among the nodes could be regarded as the translation operations in the representation space.
To be specific, for the center node $n \in V$, if there existed a node $h \in V$ and a hyperedge $r \in E$ such that $n \in r$ and $h \in r$, that is, the hyperedge $r$ was simultaneously associated with the node $n$ and the node $h$, then a triplet $(h, r, n)$ held, where $h$ is a node with the relationship $r$ to the node $n$, $H_r$ is the set of the nodes with the relationship $r$ to the node $n$, and $R_n$ is the set of hyperedges associated with the center node $n$, namely the set of relationships.
Inspired by the above topology-derived model, a novel translation constraint model based on negative sampling was proposed. To be specific, in the optimization procedure of this model, the center node $n$ was the positive sample, the other nodes were negative samples, and $NEG(n)$ was the subset of negative samples of the center node $n$ with a predefined size $d_s$. For any $\xi \in V$, the node labels are denoted as follows:
$$\delta_n(\xi) = \begin{cases} 1, & \xi \in \{n\} \\ 0, & \xi \in NEG(n) \end{cases} \quad (8)$$
In view of the node sequences $C$ and the triplets, we tried to maximize the following objective function to meet the translation constraint:
$$D_3 = \prod_{n \in C} \prod_{r \in R_n} \prod_{h \in H_r} \prod_{\xi \in \{n\} \cup NEG(n)} p(\xi \mid h + r) = \prod_{n \in C} \prod_{r \in R_n} \prod_{h \in H_r} \prod_{\xi \in \{n\} \cup NEG(n)} [\sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi)]^{\delta_n(\xi)} [1 - \sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi)]^{1 - \delta_n(\xi)} \quad (9)$$
where $\mathbf{e}_h$, $\mathbf{e}_r$, and $\mathbf{e}_{h+r}$ are all parameter vectors, with $\mathbf{e}_{h+r} = \mathbf{e}_h + \mathbf{e}_r$. By maximizing $D_3$, the interaction relations were encoded into the node representation vectors.
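Analogously, a sketch of one center node's contribution to $\log D_3$ (Formulas (8) and (9)), where the translated vector $\mathbf{e}_h + \mathbf{e}_r$ plays the predictor role that $\mathbf{X}_n$ plays in the topology-derived model; names and data layout are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_d3_term(e_node, e_rel, theta, n, R_n, H_of, neg_samples):
    """Contribution of one center node n to log D3.
    R_n: relations (hyperedges) of n; H_of[r]: nodes h sharing r with n."""
    term = 0.0
    for r in R_n:
        for h in H_of[r]:
            e_hr = e_node[h] + e_rel[r]               # e_{h+r} = e_h + e_r
            term += np.log(sigmoid(e_hr @ theta[n]))  # positive sample: xi = n
            for neg in neg_samples:                   # xi in NEG(n)
                term += np.log(1.0 - sigmoid(e_hr @ theta[neg]))
    return term
```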

5.4. Joint Optimization

In this subsection, the hypernetwork representation learning method with common constraints of the set and translation (HRST) is proposed. HRST jointly optimizes the topology-derived, set constraint, and translation constraint models. Figure 3 shows the HRST framework.
In Figure 3, the network topology representation from the topology-derived model and the hyperedge and relation representations from the set constraint and translation constraint models, respectively, share the same node representation, which is thereby enriched with hyperedge information.
In order to facilitate calculation, we took the logarithm of $D_1$, $D_2$, and $D_3$ and maximized the following joint objective function to meet the common constraints of the set and translation:
$$\mathcal{L} = \sum_{n \in C} \Bigg( \sum_{u \in \{n\} \cup NEG(n)} \Big\{ L^n(u) \log[\sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)] + [1 - L^n(u)] \log[1 - \sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)] \Big\} + \beta_1 \sum_{v \in T_n} \sum_{\vartheta \in \{n\} \cup NEG(v)} \Big\{ \delta(\vartheta \mid v) \log[\sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta)] + [1 - \delta(\vartheta \mid v)] \log[1 - \sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta)] \Big\} + \beta_2 \sum_{r \in R_n} \sum_{h \in H_r} \sum_{\xi \in \{n\} \cup NEG(n)} \Big\{ \delta_n(\xi) \log[\sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi)] + [1 - \delta_n(\xi)] \log[1 - \sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi)] \Big\} \Bigg) \quad (10)$$
where the harmonic factors $\beta_1$ and $\beta_2$ balance the contributions of the topology-derived, set constraint, and translation constraint models.
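Reusing the three sketches above, the contribution of one center node to the joint objective can be combined with the harmonic factors as follows; the argument bundling is an assumption.

```python
def joint_objective_term(n, context, params, beta1=0.5, beta2=0.5):
    """L-contribution of one center node n, combining the log D1, log D2,
    and log D3 sketches above with the harmonic factors beta1 and beta2."""
    return (log_d1_term(params["v"], params["theta"], n, context,
                        params["neg_n"])
            + beta1 * log_d2_term(params["e"], params["theta"], n,
                                  params["T_n"], params["neg_of"])
            + beta2 * log_d3_term(params["e_node"], params["e_rel"],
                                  params["theta"], n, params["R_n"],
                                  params["H_of"], params["neg_n"]))
```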
In order to facilitate derivation, $\mathcal{L}(n, u, v, \vartheta, r, h, \xi)$ is denoted as follows:
$$\mathcal{L}(n, u, v, \vartheta, r, h, \xi) = L^n(u) \log[\sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)] + [1 - L^n(u)] \log[1 - \sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)] + \beta_1 \big\{ \delta(\vartheta \mid v) \log[\sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta)] + [1 - \delta(\vartheta \mid v)] \log[1 - \sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta)] \big\} + \beta_2 \big\{ \delta_n(\xi) \log[\sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi)] + [1 - \delta_n(\xi)] \log[1 - \sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi)] \big\} \quad (11)$$
The objective function $\mathcal{L}$ was optimized by the stochastic gradient ascent method, which requires the following six kinds of gradients of $\mathcal{L}$.
Firstly, the gradient of $\mathcal{L}(n, u, v, \vartheta, r, h, \xi)$ with respect to $\boldsymbol{\theta}_u$ was calculated as follows:
$$\frac{\partial \mathcal{L}(n, u, v, \vartheta, r, h, \xi)}{\partial \boldsymbol{\theta}_u} = L^n(u)[1 - \sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)]\mathbf{X}_n - [1 - L^n(u)]\sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)\mathbf{X}_n = \big\{ L^n(u)[1 - \sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)] - [1 - L^n(u)]\sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u) \big\}\mathbf{X}_n = [L^n(u) - \sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)]\mathbf{X}_n \quad (12)$$
Consequently, the updating formula of $\boldsymbol{\theta}_u$ is denoted as follows:
$$\boldsymbol{\theta}_u = \boldsymbol{\theta}_u + \alpha [L^n(u) - \sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)]\mathbf{X}_n \quad (13)$$
where $\alpha$ is the learning rate.
Secondly, the gradient of $\mathcal{L}(n, u, v, \vartheta, r, h, \xi)$ with respect to $\mathbf{X}_n$ was calculated. The symmetry between $\boldsymbol{\theta}_u$ and $\mathbf{X}_n$ was utilized to obtain the gradient of $\mathbf{X}_n$:
$$\frac{\partial \mathcal{L}(n, u, v, \vartheta, r, h, \xi)}{\partial \mathbf{X}_n} = [L^n(u) - \sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)]\boldsymbol{\theta}_u \quad (14)$$
Consequently, the updating formula of $\mathbf{v}_v$ for each $v \in context(n)$ is denoted as follows:
$$\mathbf{v}_v = \mathbf{v}_v + \alpha \sum_{u \in \{n\} \cup NEG(n)} \frac{\partial \mathcal{L}(n, u, v, \vartheta, r, h, \xi)}{\partial \mathbf{X}_n} = \mathbf{v}_v + \alpha \sum_{u \in \{n\} \cup NEG(n)} [L^n(u) - \sigma(\mathbf{X}_n^{\mathrm{T}} \boldsymbol{\theta}_u)]\boldsymbol{\theta}_u \quad (15)$$
Thirdly, the gradient of $\mathcal{L}(n, u, v, \vartheta, r, h, \xi)$ with respect to $\boldsymbol{\theta}_\vartheta$ was calculated as follows:
$$\frac{\partial \mathcal{L}(n, u, v, \vartheta, r, h, \xi)}{\partial \boldsymbol{\theta}_\vartheta} = \beta_1 \frac{\partial}{\partial \boldsymbol{\theta}_\vartheta} \big\{ \delta(\vartheta \mid v)\log[\sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta)] + [1 - \delta(\vartheta \mid v)]\log[1 - \sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta)] \big\} = \beta_1 \big\{ \delta(\vartheta \mid v)[1 - \sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta)] - [1 - \delta(\vartheta \mid v)]\sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta) \big\}\mathbf{e}_v = \beta_1 [\delta(\vartheta \mid v) - \sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta)]\mathbf{e}_v \quad (16)$$
Consequently, the updating formula of $\boldsymbol{\theta}_\vartheta$ is denoted as follows:
$$\boldsymbol{\theta}_\vartheta = \boldsymbol{\theta}_\vartheta + \alpha \beta_1 [\delta(\vartheta \mid v) - \sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta)]\mathbf{e}_v \quad (17)$$
Fourthly, the gradient of $\mathcal{L}(n, u, v, \vartheta, r, h, \xi)$ with respect to $\mathbf{e}_v$ was calculated. The symmetry between $\boldsymbol{\theta}_\vartheta$ and $\mathbf{e}_v$ was utilized to obtain the gradient of $\mathbf{e}_v$:
$$\frac{\partial \mathcal{L}(n, u, v, \vartheta, r, h, \xi)}{\partial \mathbf{e}_v} = \beta_1 [\delta(\vartheta \mid v) - \sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta)]\boldsymbol{\theta}_\vartheta \quad (18)$$
Consequently, the updating formula of $\mathbf{e}_v$ for each $v \in T_n$ is denoted as follows:
$$\mathbf{e}_v = \mathbf{e}_v + \alpha \beta_1 [\delta(\vartheta \mid v) - \sigma(\mathbf{e}_v^{\mathrm{T}} \boldsymbol{\theta}_\vartheta)]\boldsymbol{\theta}_\vartheta \quad (19)$$
Fifthly, the gradient of $\mathcal{L}(n, u, v, \vartheta, r, h, \xi)$ with respect to $\boldsymbol{\theta}_\xi$ was calculated as follows:
$$\frac{\partial \mathcal{L}(n, u, v, \vartheta, r, h, \xi)}{\partial \boldsymbol{\theta}_\xi} = \beta_2 \frac{\partial}{\partial \boldsymbol{\theta}_\xi} \big\{ \delta_n(\xi)\log[\sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi)] + [1 - \delta_n(\xi)]\log[1 - \sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi)] \big\} = \beta_2 \big\{ \delta_n(\xi)[1 - \sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi)] - [1 - \delta_n(\xi)]\sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi) \big\}\mathbf{e}_{h+r} = \beta_2 [\delta_n(\xi) - \sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi)]\mathbf{e}_{h+r} \quad (20)$$
Consequently, the updating formula of $\boldsymbol{\theta}_\xi$ is denoted as follows:
$$\boldsymbol{\theta}_\xi = \boldsymbol{\theta}_\xi + \alpha \beta_2 [\delta_n(\xi) - \sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi)]\mathbf{e}_{h+r} \quad (21)$$
Finally, the gradient of $\mathcal{L}(n, u, v, \vartheta, r, h, \xi)$ with respect to $\mathbf{e}_{h+r}$ was calculated. The symmetry between $\boldsymbol{\theta}_\xi$ and $\mathbf{e}_{h+r}$ was utilized to obtain the gradient of $\mathbf{e}_{h+r}$:
$$\frac{\partial \mathcal{L}(n, u, v, \vartheta, r, h, \xi)}{\partial \mathbf{e}_{h+r}} = \beta_2 [\delta_n(\xi) - \sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi)]\boldsymbol{\theta}_\xi \quad (22)$$
Since $\mathbf{e}_{h+r} = \mathbf{e}_h + \mathbf{e}_r$ and the vectors to be updated are $\mathbf{e}_h$ and $\mathbf{e}_r$, the gradient $\partial \mathcal{L}(n, u, v, \vartheta, r, h, \xi)/\partial \mathbf{e}_{h+r}$ was applied to $\mathbf{e}_h$ and $\mathbf{e}_r$ separately. The updating formulae of $\mathbf{e}_h$ and $\mathbf{e}_r$ are denoted as follows:
$$\mathbf{e}_h = \mathbf{e}_h + \alpha \beta_2 \sum_{\xi \in \{n\} \cup NEG(n)} [\delta_n(\xi) - \sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi)]\boldsymbol{\theta}_\xi \quad (23)$$
$$\mathbf{e}_r = \mathbf{e}_r + \alpha \beta_2 \sum_{\xi \in \{n\} \cup NEG(n)} [\delta_n(\xi) - \sigma(\mathbf{e}_{h+r}^{\mathrm{T}} \boldsymbol{\theta}_\xi)]\boldsymbol{\theta}_\xi \quad (24)$$
The stochastic gradient ascent method was used for optimization. More details are shown in Algorithm 1.
Algorithm 1: HRST
1 Input:
2   Hypernetwork $H = (V, E)$
3   Embedding size $d$
4 Output:
5   Embedding matrix $\mathbf{X} \in \mathbb{R}^{|V| \times d}$
6 for node $n$ in $V$ do
7   initialize embedding vector $\mathbf{v}_n \in \mathbb{R}^{1 \times d}$
8   initialize parameter vector $\boldsymbol{\theta}_n \in \mathbb{R}^{1 \times d}$
9   for node $v$ in $T_n$ do
10    initialize parameter vector $\mathbf{e}_v \in \mathbb{R}^{1 \times d}$
11  end for
12  for hyperedge $r$ in $R_n$ do
13   for node $h$ in $H_r$ do
14     initialize parameter vector $\mathbf{e}_{h+r} \in \mathbb{R}^{1 \times d}$
15   end for
16  end for
17 end for
18 node sequences $C$ = RandomWalk()
19 for $(n, context(n))$ in $C$ do
20  update parameter vector $\boldsymbol{\theta}_u$ according to Formula (13)
21  update embedding vector $\mathbf{v}_v$ according to Formula (15)
22  update parameter vector $\boldsymbol{\theta}_\vartheta$ according to Formula (17)
23  for node $v$ in $T_n$ do
24    update parameter vector $\mathbf{e}_v$ according to Formula (19)
25  end for
26  update parameter vector $\boldsymbol{\theta}_\xi$ according to Formula (21)
27  for hyperedge $r$ in $R_n$ do
28   for node $h$ in $H_r$ do
29     update parameter vector $\mathbf{e}_h$ according to Formula (23)
30     update parameter vector $\mathbf{e}_r$ according to Formula (24)
31   end for
32  end for
33 end for
34 for $i = 0$; $i < |V|$; $i$++ do
35   $\mathbf{X}_i = \mathbf{v}_{v_i}$
36 end for
37 return $\mathbf{X}$
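As a complement to the pseudocode, the following is a hedged Python rendering of the topology update in lines 20–21 of Algorithm 1 (Formulas (13)–(15)); the set and translation updates in lines 22–30 follow the same pattern with $\mathbf{e}_v$ and $\mathbf{e}_{h+r}$ in place of $\mathbf{X}_n$. The function name and in-place dictionary updates are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def topology_update(v, theta, n, context, neg_samples, alpha=0.025):
    """One stochastic-gradient-ascent step for a pair (n, context(n))."""
    X_n = np.sum([v[c] for c in context], axis=0)
    grad_X = np.zeros_like(X_n)
    for u, label in [(n, 1.0)] + [(x, 0.0) for x in neg_samples]:
        g = alpha * (label - sigmoid(X_n @ theta[u]))  # alpha[L^n(u) - sigma(X_n^T theta_u)]
        grad_X += g * theta[u]         # accumulate before theta_u changes (Formula 14)
        theta[u] = theta[u] + g * X_n  # Formula (13)
    for c in context:                  # Formula (15): spread gradient to each v_c
        v[c] = v[c] + grad_X
```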

5.5. Complexity Analysis

The time complexity of HRST was $O(|C|(d_s + 1)(\beta_1 M_S + \beta_2 M_{H_r} M_R + 1))$, where the time complexities of the topology-derived, set constraint, and translation constraint models are $O(|C|(d_s + 1))$, $O(|C|(d_s + 1) M_S)$, and $O(|C|(d_s + 1) M_{H_r} M_R)$, respectively. Here, $d_s$ is a constant independent of the network size, $M_S = \max\{|T_{v_1}|, |T_{v_2}|, \ldots, |T_{v_{|V|}}|\}$ is the maximum size of the hyperedge sets associated with the nodes, $M_{H_r}$ is the maximum size of $H_r$, the set of nodes appearing in triplets with the relation $r$, and $M_R = \max\{|R_{v_1}|, |R_{v_2}|, \ldots, |R_{v_{|V|}}|\}$ is the maximum size of the relation sets associated with the nodes.

6. Experiments

6.1. Dataset

Four hypernetwork datasets were used to evaluate the effectiveness of HRST. Detailed dataset statistics are shown in Table 1.
Four datasets are shown as follows:
  • GPS [23] described a situation where a user partook in an activity in a location. The set of three-tuple <user, location, activity> was used to construct the hypernetwork.
  • MovieLens [24] described personal tag activities from MovieLens. The set of three-tuple <user, movie, tag> was used to construct the hypernetwork, where each movie had at least one genre.
  • Drug (http://www.fda.gov/Drugs/, accessed on 27 January 2020) described a situation where the user took drugs and had certain reactions that led to adverse events. The set of three-tuple <user, drug, reaction> was used to construct the hypernetwork.
  • wordnet [21] was composed of a set of triplets <head, relation, tail> extracted from WordNet 3.0. The set of three-tuple <head, relationship, tail> was used to construct the hypernetwork.

6.2. Baseline Methods

DeepWalk. DeepWalk is a classical representation learning method to learn node representation vectors.
node2vec. node2vec preserves network neighborhoods of the nodes to learn node representation vectors.
LINE. LINE preserves both first- and second-order proximities to learn node representation vectors.
GraRep. GraRep captures global structure properties of a graph by k-step loss functions to learn node representation vectors.
HOPE. HOPE captures the higher-order proximity and asymmetric transitivity of a graph to learn node representation vectors.
SDNE. SDNE [25] utilizes first- and second-order proximities to characterize local and global network structures to learn node representation vectors.
HRSC. HRSC [11] incorporates the hyperedge sets associated with the nodes into the process of hypernetwork representation learning.
HRTC. HRTC models the interaction relationships among the nodes through the translation mechanism and incorporates the relationships among the nodes into the process of hypernetwork representation learning.
HRST. HRST incorporates the hyperedge sets associated with the nodes and interaction relationships among the nodes modeled through the translation mechanism into the process of hypernetwork representation learning.

6.3. Experimental Setting

Node classification and link prediction were used to evaluate the effectiveness of HRST. The vector dimension was set to 100, the number of random walks starting from each node to 10, and the length of each random walk to 40. For each dataset, a portion of the data was randomly selected as the training set and the rest as the test set.
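Algorithm 1 obtains the node sequences $C$ from a RandomWalk() call; a plausible sketch with the settings above is given below. Uniform, DeepWalk-style transitions on the two-section graph are an assumption, as the walk strategy is not spelled out here.

```python
import random

def random_walks(adj, num_walks=10, walk_len=40, seed=0):
    """adj: dict node -> list of neighbors in the two-section graph."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        for start in adj:
            walk = [start]
            while len(walk) < walk_len and adj[walk[-1]]:
                walk.append(rng.choice(adj[walk[-1]]))  # uniform transition
            walks.append(walk)
    return walks
```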

6.4. Node Classification

The multi-label classification task [1] was conducted on the MovieLens and wordnet datasets, because only these two datasets have labels. In addition, the nodes without labels on the two datasets were removed. An SVM [26] classifier was trained to calculate node classification accuracies.
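A minimal sketch of this evaluation protocol with scikit-learn; the SVM variant, its hyperparameters, and the single-label treatment of each node here are assumptions, not the paper's exact setup.

```python
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def node_classification_accuracy(X, y, train_ratio, seed=0):
    """X: |V| x d embedding matrix; y: node labels; train_ratio in (0, 1)."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=train_ratio, random_state=seed)
    clf = LinearSVC().fit(X_tr, y_tr)  # linear SVM on learned embeddings
    return accuracy_score(y_te, clf.predict(X_te))
```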
From Table 2 and Table 3, the following observations were obtained:
  • For the two datasets, the average node classification accuracy of HRST was very close to those of HRSC and HRTC and better than those of the other baseline methods. For instance, in terms of average node classification accuracy, HRST outperformed the best of the other baseline methods (e.g., DeepWalk) by about 1% on the two datasets, while the average accuracies of the remaining baseline methods were generally lower.
  • The average node classification accuracy of GraRep ranked just behind those of HRST, HRSC, HRTC, and DeepWalk, and was very close to that of DeepWalk, because GraRep integrates the hyperedges to a certain extent into the process of network representation learning.
In short, the node representation vectors learnt by HRST were of higher quality.

6.5. Link Prediction

In this subsection, the link prediction task was evaluated by the AUC measure [27]. From Table 4, Table 5, Table 6 and Table 7, the following observations were obtained:
  • On the GPS and drug datasets, the average AUC value of HRST was very close to that of HRSC and superior to that of HRTC. On the wordnet dataset, the average AUC value of HRST was almost the same as those of HRSC and HRTC. On the MovieLens dataset, the average AUC values of HRST and HRTC were lower than those of HRSC and DeepWalk. On the whole, HRST performed better than most baseline methods, which indicates the effectiveness of HRST.
  • HRST performed consistently across different training ratios compared with the other baseline methods, which demonstrates its feasibility and robustness.
  • HRST almost always performed better than the baseline methods that do not incorporate hyperedges, which verifies the assumption that incorporating the hyperedges into the process of hypernetwork representation learning benefits link prediction.
In short, the above observations demonstrate that HRST can obtain high-quality node representation vectors.

6.6. Parameter Sensitivity

The harmonic factors $\beta_1$ and $\beta_2$ balance the contributions of the topology-derived, set constraint, and translation constraint models. The training ratio and $\beta_2$ were fixed to 50% and 0.5, respectively, and node classification accuracies were calculated with $\beta_1$ ranging from 0.1 to 0.9 on the MovieLens and wordnet datasets. Figure 4 shows the comparison of node classification accuracies under different values of $\beta_1$.
As shown in Figure 4, the node classification performance of HRST was not sensitive to the parameter $\beta_1$, which demonstrates the robustness of HRST: the variation ranges of node classification accuracies under different $\beta_1$ were all within 2.5%.
On the MovieLens and wordnet datasets, the best node classification results were achieved at $\beta_1 = 0.2$ and $\beta_1 = 0.7$, respectively.

7. Conclusions

Hypernetwork representation learning can explore the relationships among the nodes and has a wide range of application scenarios, such as trend prediction, personalized recommendation, and other online applications. We therefore proposed a hypernetwork representation learning method with common constraints of the set and translation, which effectively incorporates the hyperedges into the process of hypernetwork representation learning and regards the learning of node representation vectors as a joint optimization problem, solved by means of the stochastic gradient ascent method. The experimental results demonstrated that our proposed method was superior to almost all baseline methods. Although we carried out hypernetwork representation learning by means of a transformation from the hypergraph to a conventional graph and incorporated the hyperedges into the learning process, some hypernetwork structure information was still lost. Therefore, future research can proceed in two directions: firstly, continue to incorporate the hyperedges into network representation learning methods; secondly, avoid transforming the hypernetwork into a conventional network, so that the hyperedges are no longer decomposed but are regarded as a whole in hypernetwork representation learning.

Author Contributions

Conceptualization, Y.Z. and H.Z.; methodology, Y.Z. and H.Z.; software, Y.Z.; validation, Y.Z.; formal analysis, Y.Z.; investigation, Y.Z. and H.Z.; resources, Y.Z.; data curation, Y.Z.; writing—original draft preparation, Y.Z.; writing—review and editing, Y.Z. and H.Z.; visualization, Y.Z.; supervision, Y.Z. and H.Z.; project administration, Y.Z. and H.Z.; funding acquisition, Y.Z., X.W., and J.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant numbers 62166032, 62162053, and 62062059; by the Natural Science Foundation of Qinghai Province, grant numbers 2022-ZJ-961Q and 2022-ZJ-701; by the Project from Tsinghua University, grant number SKL-IOW-2020TC2004-01; and by the Open Project of State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, grant number 2020-ZZ-03.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained in the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ruan, Q.S.; Zhang, Y.R.; Zheng, Y.H.; Wang, Y.D.; Wu, Q.F.; Ma, T.Q.; Liu, X.L. Recommendation model based on a heterogeneous personalized spacey embedding method. Symmetry 2021, 13, 290. [Google Scholar] [CrossRef]
  2. Wang, M.H.; Qiu, L.L.; Wang, X.L. A survey on knowledge graph embeddings for link prediction. Symmetry 2021, 13, 485. [Google Scholar] [CrossRef]
  3. Li, Y.H.; Wang, J.Q.; Wang, X.J.; Zhao, Y.L.; Lu, X.H.; Liu, D.L. Community detection based on differential evolution using social spider optimization. Symmetry 2017, 9, 183. [Google Scholar] [CrossRef] [Green Version]
  4. Perozzi, B.; Al-Rfou, R.; Skiena, S. DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 701–710. [Google Scholar]
  5. Grover, A.; Leskovec, J. Node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 855–864. [Google Scholar]
  6. Tang, J.; Qu, M.; Wang, M.Z.; Zhang, M.; Yan, J.; Mei, Q.Z. Line: Large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 1067–1077. [Google Scholar]
  7. Cao, S.S.; Lu, W.; Xu, Q.K. Grarep: Learning graph representations with global structural information. In Proceedings of the 24th ACM International Conference on Information and Knowledge Management, Melbourne, Australia, 19–23 October 2015; pp. 891–900. [Google Scholar]
  8. Ou, M.D.; Cui, P.; Pei, J.; Zhang, Z.W.; Zhu, W.W. Asymmetric transitivity preserving graph embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1105–1114. [Google Scholar]
  9. Tu, C.C.; Liu, H.; Liu, Z.Y.; Sun, M.S. CANE: Context-aware network embedding for relation modeling. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, BC, Canada, 30 July–4 August 2017; pp. 1722–1731. [Google Scholar]
  10. Tu, C.C.; Zeng, X.K.; Wang, H.; Zhang, Z.Y.; Liu, Z.Y.; Sun, M.S.; Zhang, B.; Lin, L.Y. A unified framework for community detection and network representation learning. IEEE Trans. Knowl. Data Eng. 2019, 31, 1051–1065. [Google Scholar] [CrossRef] [Green Version]
  11. Zhu, Y.; Zhao, H.X. Hypernetwork representation learning with the set constraint. Appl. Sci. 2022, 12, 2650. [Google Scholar] [CrossRef]
  12. Agarwal, S.; Branson, K.; Belongie, S. Higher order learning with graphs. In Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, USA, 25 June 2006; pp. 17–24. [Google Scholar]
  13. Huang, J.; Chen, C.; Ye, F.H.; Wu, J.J.; Zheng, Z.B.; Ling, G.H. Hyper2vec: Biased random walk for hyper-network embedding. In Proceedings of the 24th International Conference on Database Systems for Advanced Applications, Chiang Mai, Thailand, 23–25 April 2019; pp. 273–277. [Google Scholar]
  14. Huang, J.; Liu, X.; Song, Y.Q. Hyper-path-based representation learning for hyper-networks. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 449–458. [Google Scholar]
  15. Tu, K.; Cui, P.; Wang, X.; Wang, F.; Zhu, W.W. Structural deep embedding for hyper-networks. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; pp. 426–433. [Google Scholar]
  16. Zhou, D.Y.; Huang, J.Y.; Schölkopf, B. Learning with hypergraphs: Clustering, classification and embedding. In Proceedings of the 19th International Conference on Neural Information Processing Systems, Vancouver, Canada, 4–7 December 2006; pp. 1601–1608. [Google Scholar]
  17. Sharma, K.K.; Seal, A.; Herrera-Viedma, E.; Krejcar, O. An enhanced spectral clustering algorithm with s-distance. Symmetry 2021, 13, 596. [Google Scholar] [CrossRef]
  18. Bretto, A. Hypergraph Theory: An Introduction; Springer Press: Berlin, Germany, 2013; pp. 24–27. [Google Scholar]
  19. Zhang, R.C.; Zou, Y.S.; Ma, J. Hyper-SAGNN: A self-attention based graph neural network for hypergraphs. arXiv 2019, arXiv:1911.02613. [Google Scholar]
  20. Song, G.; Li, J.W.; Wang, Z. Occluded offline handwritten Chinese character inpainting via generative adversarial network and self-attention mechanism. Neurocomputing 2020, 415, 146–156. [Google Scholar] [CrossRef]
  21. Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. In Proceedings of the 26th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–10 December 2013; pp. 2787–2795. [Google Scholar]
  22. Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G.; Dean, J. Distributed representations of words and phrases and their compositionality. In Proceedings of the 26th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–10 December 2013; pp. 3111–3119. [Google Scholar]
  23. Zheng, V.W.; Cao, B.; Zheng, Y.; Xie, X.; Yang, Q. Collaborative filtering meets mobile recommendation: A user-centered approach. In Proceedings of the 24th AAAI Conference on Artificial Intelligence, Atlanta, GA, USA, 11–15 July 2010; pp. 236–241. [Google Scholar]
  24. Harper, F.M.; Konstan, J.A. The movielens datasets: History and context. ACM Trans. Interact. Intell. Syst. 2015, 5, 19. [Google Scholar] [CrossRef]
  25. Wang, D.X.; Cui, P.; Zhu, W.W. Structural deep network embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1225–1234. [Google Scholar]
  26. Xu, J.L.; Han, J.W.; Nie, F.P.; Li, X.L. Multi-view scaling support vector machines for classification and feature selection. IEEE Trans. Knowl. Data Eng. 2020, 32, 1419–1430. [Google Scholar] [CrossRef]
  27. Wang, Y.G.; Huang, G.N.; Yang, J.J.; Lai, H.D.; Liu, S.; Chen, C.R.; Xu, W.C. Change point detection with mean shift based on AUC from symmetric sliding windows. Symmetry 2020, 12, 599. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Hypergraph and two-section graph: (a) hypergraph; (b) two-section graph.
Figure 2. TransE, where $\mathbf{h} + \mathbf{r} \approx \mathbf{t}$.
Figure 3. HRST framework, where $v_i$ is the center node; the nodes $v_{i-s}, v_{i-s+1}, \ldots, v_{i+s-1}, v_{i+s}$ are the contextual nodes of the center node $v_i$, namely $context(v_i)$; $T_{v_i}$ is the hyperedge set associated with the center node $v_i$; $r$ is an interaction relation, namely a hyperedge; $h$ is a node with the relation $r$ to the center node $v_i$; and $R_{v_i}$ is the set of relations (hyperedges) associated with the center node $v_i$.
Figure 4. Parameter sensitivity: (a) sensitivity on MovieLens; (b) sensitivity on wordnet.
Table 1. Dataset statistics (node counts are listed per node type).

| Dataset | Node Types | #(V) | #(E) |
|---|---|---|---|
| GPS | user / location / activity | 146 / 70 / 5 | 1436 |
| MovieLens | user / movie / tag | 457 / 1688 / 1530 | 5965 |
| drug | user / drug / reaction | 41 / 32 / 221 | 1195 |
| wordnet | head / relationship / tail | 1754 / 7 / 1549 | 2174 |
Table 2. Node classification accuracies on MovieLens (%).

| Methods | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% | Average | Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DeepWalk | 48.01 | 50.35 | 51.41 | 52.60 | 52.59 | 53.47 | 53.57 | 54.23 | 54.09 | 52.26 | 4 |
| node2vec | 46.93 | 49.28 | 50.77 | 51.51 | 52.62 | 52.58 | 53.04 | 53.44 | 52.71 | 51.43 | 6 |
| LINE | 43.93 | 45.46 | 46.52 | 47.29 | 47.70 | 48.16 | 48.02 | 49.09 | 48.34 | 47.17 | 8 |
| GraRep | 47.75 | 50.11 | 51.16 | 52.01 | 52.10 | 53.15 | 53.34 | 53.43 | 53.24 | 51.81 | 5 |
| HOPE | 46.33 | 48.57 | 49.95 | 50.69 | 51.06 | 51.04 | 51.29 | 52.53 | 51.79 | 50.36 | 7 |
| SDNE | 41.74 | 41.79 | 42.34 | 42.73 | 43.36 | 43.27 | 43.89 | 43.43 | 42.81 | 42.82 | 9 |
| HRSC | 48.60 | 50.81 | 52.02 | 53.19 | 53.73 | 54.12 | 54.83 | 54.95 | 56.14 | 53.15 | 3 |
| HRTC | 48.73 | 51.26 | 52.71 | 53.62 | 54.38 | 54.57 | 55.03 | 55.82 | 56.30 | 53.60 | 1 |
| HRST | 48.45 | 50.90 | 52.41 | 53.23 | 53.83 | 54.45 | 54.95 | 55.58 | 55.84 | 53.29 | 2 |
Table 3. Node classification accuracies on wordnet (%).

| Methods | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% | Average | Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DeepWalk | 29.91 | 33.44 | 34.53 | 35.05 | 35.70 | 36.80 | 37.93 | 36.71 | 39.00 | 35.45 | 4 |
| node2vec | 29.27 | 32.23 | 33.71 | 34.52 | 36.17 | 36.05 | 37.53 | 37.66 | 37.30 | 34.94 | 6 |
| LINE | 22.77 | 24.11 | 25.11 | 24.94 | 25.23 | 25.59 | 25.87 | 26.60 | 25.44 | 25.07 | 8 |
| GraRep | 32.59 | 34.74 | 34.63 | 35.21 | 35.38 | 36.05 | 35.10 | 36.63 | 37.79 | 35.35 | 5 |
| HOPE | 30.53 | 33.61 | 35.02 | 35.97 | 34.90 | 35.11 | 36.21 | 36.20 | 34.84 | 34.71 | 7 |
| SDNE | 21.96 | 21.57 | 22.05 | 22.37 | 23.26 | 22.59 | 23.63 | 23.60 | 25.31 | 22.93 | 9 |
| HRSC | 31.54 | 33.94 | 34.98 | 36.84 | 37.35 | 38.02 | 38.78 | 40.22 | 41.10 | 36.97 | 2 |
| HRTC | 31.30 | 33.79 | 35.59 | 36.18 | 36.95 | 37.94 | 38.14 | 38.63 | 40.07 | 36.51 | 3 |
| HRST | 30.81 | 33.92 | 36.03 | 36.85 | 37.32 | 38.73 | 39.28 | 40.08 | 41.43 | 37.16 | 1 |
Table 4. AUC values on GPS.

| Methods | 60% | 65% | 70% | 75% | 80% | 85% | 90% | Average | Rank |
|---|---|---|---|---|---|---|---|---|---|
| DeepWalk | 0.4308 | 0.4278 | 0.4205 | 0.4583 | 0.4418 | 0.4914 | 0.4831 | 0.4505 | 5 |
| node2vec | 0.3660 | 0.3614 | 0.3808 | 0.3939 | 0.3834 | 0.3958 | 0.3649 | 0.3780 | 8 |
| LINE | 0.4575 | 0.4829 | 0.4761 | 0.4562 | 0.4429 | 0.4663 | 0.4574 | 0.4628 | 4 |
| GraRep | 0.3873 | 0.3805 | 0.3882 | 0.3765 | 0.3820 | 0.3857 | 0.3874 | 0.3839 | 7 |
| HOPE | 0.3805 | 0.3676 | 0.3416 | 0.2971 | 0.2794 | 0.2518 | 0.2334 | 0.3073 | 9 |
| SDNE | 0.3262 | 0.4371 | 0.4319 | 0.3157 | 0.4379 | 0.3527 | 0.4540 | 0.3936 | 6 |
| HRSC | 0.7516 | 0.7562 | 0.7488 | 0.7449 | 0.7236 | 0.7325 | 0.7279 | 0.7408 | 1 |
| HRTC | 0.6845 | 0.6428 | 0.6483 | 0.6403 | 0.6216 | 0.6005 | 0.5856 | 0.6319 | 3 |
| HRST | 0.7200 | 0.7220 | 0.7241 | 0.7276 | 0.6929 | 0.7071 | 0.7006 | 0.7135 | 2 |
Table 5. AUC values on MovieLens.

| Methods | 60% | 65% | 70% | 75% | 80% | 85% | 90% | Average | Rank |
|---|---|---|---|---|---|---|---|---|---|
| DeepWalk | 0.7845 | 0.8129 | 0.8301 | 0.8440 | 0.8729 | 0.8800 | 0.9025 | 0.8467 | 2 |
| node2vec | 0.7078 | 0.7390 | 0.7418 | 0.7696 | 0.7939 | 0.8036 | 0.8296 | 0.7693 | 7 |
| LINE | 0.8282 | 0.8242 | 0.8253 | 0.8320 | 0.8365 | 0.8172 | 0.8231 | 0.8266 | 5 |
| GraRep | 0.7290 | 0.7833 | 0.7907 | 0.8121 | 0.8277 | 0.8481 | 0.8544 | 0.8065 | 6 |
| HOPE | 0.6895 | 0.7333 | 0.7203 | 0.7522 | 0.7787 | 0.7986 | 0.8049 | 0.7539 | 8 |
| SDNE | 0.4004 | 0.3511 | 0.3494 | 0.3406 | 0.3433 | 0.3598 | 0.4171 | 0.3660 | 9 |
| HRSC | 0.8714 | 0.8706 | 0.8681 | 0.8644 | 0.8706 | 0.8651 | 0.8528 | 0.8661 | 1 |
| HRTC | 0.8495 | 0.8497 | 0.8351 | 0.8325 | 0.8387 | 0.8281 | 0.8320 | 0.8379 | 3 |
| HRST | 0.8377 | 0.8434 | 0.8351 | 0.8268 | 0.8341 | 0.8339 | 0.8297 | 0.8344 | 4 |
Table 6. AUC values on drug.

| Methods | 60% | 65% | 70% | 75% | 80% | 85% | 90% | Average | Rank |
|---|---|---|---|---|---|---|---|---|---|
| DeepWalk | 0.4852 | 0.4954 | 0.4934 | 0.4580 | 0.4901 | 0.4638 | 0.4713 | 0.4796 | 6 |
| node2vec | 0.4500 | 0.4525 | 0.4490 | 0.4525 | 0.4329 | 0.4712 | 0.4345 | 0.4489 | 8 |
| LINE | 0.4750 | 0.4672 | 0.4636 | 0.4625 | 0.4741 | 0.4523 | 0.4768 | 0.4674 | 7 |
| GraRep | 0.5025 | 0.5089 | 0.4867 | 0.5051 | 0.5557 | 0.5835 | 0.5362 | 0.5255 | 4 |
| HOPE | 0.5055 | 0.5269 | 0.4933 | 0.4690 | 0.4941 | 0.4668 | 0.4271 | 0.4832 | 5 |
| SDNE | 0.2948 | 0.4310 | 0.4454 | 0.5050 | 0.5196 | 0.3536 | 0.3836 | 0.4190 | 9 |
| HRSC | 0.7871 | 0.7774 | 0.7822 | 0.7920 | 0.7747 | 0.7983 | 0.8056 | 0.7882 | 1 |
| HRTC | 0.7153 | 0.7071 | 0.7134 | 0.7134 | 0.6868 | 0.7108 | 0.7240 | 0.7101 | 3 |
| HRST | 0.7914 | 0.7697 | 0.7685 | 0.7977 | 0.7586 | 0.7771 | 0.8105 | 0.7819 | 2 |
Table 7. AUC values on wordnet.

| Methods | 60% | 65% | 70% | 75% | 80% | 85% | 90% | Average | Rank |
|---|---|---|---|---|---|---|---|---|---|
| DeepWalk | 0.7780 | 0.8181 | 0.8305 | 0.8341 | 0.8708 | 0.8765 | 0.8880 | 0.8423 | 4 |
| node2vec | 0.7807 | 0.8242 | 0.8309 | 0.8285 | 0.8519 | 0.8503 | 0.8595 | 0.8323 | 5 |
| LINE | 0.8063 | 0.8184 | 0.8056 | 0.8091 | 0.8000 | 0.7938 | 0.7926 | 0.8037 | 6 |
| GraRep | 0.7685 | 0.7742 | 0.7888 | 0.7806 | 0.7958 | 0.7972 | 0.7756 | 0.7830 | 7 |
| HOPE | 0.6902 | 0.7314 | 0.7417 | 0.7403 | 0.7649 | 0.7763 | 0.7700 | 0.7450 | 8 |
| SDNE | 0.3712 | 0.5348 | 0.4784 | 0.4824 | 0.4254 | 0.6159 | 0.4850 | 0.4847 | 9 |
| HRSC | 0.8953 | 0.9079 | 0.9086 | 0.9036 | 0.9093 | 0.9045 | 0.9034 | 0.9047 | 1 |
| HRTC | 0.9030 | 0.9115 | 0.9050 | 0.9016 | 0.9098 | 0.9027 | 0.8912 | 0.9035 | 2 |
| HRST | 0.8938 | 0.9014 | 0.8910 | 0.8896 | 0.9026 | 0.8961 | 0.8945 | 0.8956 | 3 |

Share and Cite

MDPI and ACS Style

Zhu, Y.; Zhao, H.; Huang, J.; Wang, X. Hypernetwork Representation Learning with Common Constraints of the Set and Translation. Symmetry 2022, 14, 1745. https://doi.org/10.3390/sym14081745

