Article

HRER: A New Bottom-Up Rule Learning for Knowledge Graph Completion

College of Electronic Engineering, National University of Defense Technology, Hefei 230037, China
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Electronics 2022, 11(6), 908; https://doi.org/10.3390/electronics11060908
Submission received: 16 February 2022 / Revised: 2 March 2022 / Accepted: 11 March 2022 / Published: 15 March 2022
(This article belongs to the Special Issue Advances in Data Mining and Knowledge Discovery)

Abstract

Knowledge graphs (KGs) are collections of structured facts that have recently attracted growing attention. Although KGs contain billions of triples, they are still incomplete, and this incompleteness limits practical applications. Predicting new facts from a given knowledge graph is therefore an increasingly important task. This paper investigates models based on logic rules and proposes HRER, a new bottom-up rule learning method for knowledge graph completion. First, inspired by the observation that the known information in KGs is incomplete and unbalanced, HRER modifies the screening indicators used by existing relation rule mining methods. The new metric, HRR, is more effective than traditional confidence measures in filtering Horn rules. Second, motivated by the differences between embedding-based methods and methods based on logic rules, HRER proposes entity rules, which make up for the limited expressiveness of Horn rules to some extent. HRER needs only a few parameters to control the number of rules and can explain its predictions. Experiments show that HRER achieves state-of-the-art results across the standard link prediction datasets.

1. Introduction

Large-scale knowledge graphs (KGs) such as Freebase [1], DBpedia [2], NELL [3], and YAGO [4] have developed significantly in recent years. These KGs contain a large number of facts stored in the form of triples (h, r, t), where h, r, and t represent the head, the relation, and the tail, respectively. KGs play a crucial role in intelligent question answering, search engines, and smart healthcare applications. However, KGs cannot exhaust all triples. Although there are billions of triples in KGs, they are still incomplete, and this incompleteness limits practical applications. For example, over 70% of people included in Freebase have no known place of birth, and 99% have no known ethnicity, which significantly limits search and question answering. Therefore, knowledge graph reasoning, which infers new knowledge from incomplete KGs, has received increasing attention.
Link prediction is a fundamental task of knowledge graph reasoning: given (?, r, t), predict the missing head entity, or given (h, r, ?), predict the missing tail entity, based on the existing knowledge graph, where ? represents the unknown entity. This paper divides link prediction methods into two types based on their characterization [5,6,7]: embedding methods, which utilize latent features, and traditional methods based on logic rules, which employ observed features.
Embedding methods are the mainstream approaches [1,8,9,10,11,12,13,14,15]. These methods learn the embeddings of entities and relations simultaneously, then measure the plausibility of triples through specific score functions between entities and relations. They can achieve better performance because they are free from the restrictions of rule representation. However, these methods have two shortcomings. The first is that their inference is unexplainable, which means that erroneous predictions cannot be corrected in practical applications. The second is that these models have different parameters and are sensitive to specific parameters, which makes it hard to compare their pros and cons.
Logic-rule methods originated from inductive logic programming (ILP) in semiotics. Although these methods [5,16,17,18] do not perform well on standard datasets, practical applications are more inclined to adopt them for their interpretability. In practical applications [5,6,7], people can manually correct the biased results of interpretable models.
In this paper, our overarching interest is explainable inferences, i.e., the method of logic rules. However, the current logic-rule methods have the following two drawbacks.
The first is that existing models ignore the incomplete and unbalanced facts in KGs. Most traditional metrics, such as Standard Confidence, were designed under the closed world assumption. However, KGs are open-domain datasets, which may contain many incomplete facts. Although some methods, such as AMIE [5] and RuleN [18], consider this incompleteness, their rule metrics are still inappropriate. These inappropriate metrics severely limit the number and quality of the logic rules mined, which drives us to design more reasonable metrics.
The second is that the limited form of expressible rules restricts performance. Current methods only mine Horn rules, which are relatively simple and cannot describe complicated relations between entities. This limited representation constrains the model, so we need to mine other, non-Horn rules.
In order to alleviate the above two drawbacks of the current logic-rule methods, this paper proposes HRER, a knowledge graph reasoning model based on the logic rule and entity rules. The specific strategies of HRER to these drawbacks are as follows.
First of all, since current filtering indicators ignore the incompleteness and biased distribution of information in KGs, we propose a new index, Horn rule reliability (HRR). The index HRR takes into account the incompleteness of the knowledge graph and the biased distribution of the facts, so it can screen Horn rules more reasonably. Experiments on benchmark datasets show that the indicator HRR performs better in mining Horn rules.
Besides, to solve the problem that the limited form of logic rules restricts the performance, this paper first analyzes the differences between the logic rule methods and the embedding-based methods. Embedding-based methods learn to represent entities and relations simultaneously, which learn the equivalence of relations and the relevance of entities. Inspired by this observation, this paper proposes entity rules based on Horn rules. Entity rules can mine the inclusion and equivalence relations of entities on attributes. Experiments on benchmark datasets show that entity rules can effectively improve the performance of models based on Horn rules.
Figure 1 shows the architecture of HRER. HRER contains two main parts. The first part mines relation rules (i.e., Horn rules), and the second part mines entity rules. The upper part of Figure 1 shows the mining of Horn rules with Horn rule reliability, and the lower part shows the search for entity rules. We then obtain specific weights for the two types of rules based on the overall performance on the training data. Finally, we apply these weights to merge the two types of rules and perform link prediction.
In summary, the main contributions of this paper are as follows.
This paper proposes a new index, Horn rule reliability (HRR), which alleviates the problem caused by incompleteness and biased distribution. Experiments show that Horn rules filtered by the metric HRR achieve state-of-the-art performance in the link prediction task.
This paper proposes reasoning with entity rules, which makes up for the insufficient representation of relation rules to some extent. Experiments show that inference based on entity rules can improve link prediction by at least 2% on Hits@10.
HRER is explainable, providing the basis for the prediction. Unlike the embedding models, which are sensitive to parameters, HRER has only a few parameters for controlling the number of rules.
The rest of this paper is structured as follows. Section 2 presents a brief overview of related work. Section 3 introduces the problem formulation, including definitions and preliminaries. Section 4 is the central part of the model, which mainly explains the design of the Horn rule reliability index and the realization of entity rules. Section 5 describes the new evaluation metric and experiments. Finally, we summarize our findings along with the future directions in Section 6.

2. Related Work

This section describes related works and the critical differences between them. We divide knowledge graph models into two families: methods based on latent features and methods based on observed features.

2.1. Methods Based on Latent Features

Methods based on latent features (i.e., the embedding methods) belong to numerical reasoning. These methods first design the corresponding representation (including the representation of entities and relations, and the score function of triples) and train to make the matching score of the correct triples get the maximum value (i.e., make the matching function of the error triples achieve the minimum). Finally, the trained models are applied for link prediction. We divide embedding methods into three types according to the matching function: geometric models, tensor factorization models and deep learning models.
Geometric Models treat relations as transformations between heads and tails in latent spaces. TransE [1] directly uses the Euclidean distance between the entity and relation vectors to measure the matching degree of triples. However, TransE cannot describe 1-N, N-1, and N-N relationships well. TransH [19] revises the representation of entities and proposes that entities have different representations under different relations. TransR [12] argues that a single semantic space cannot adequately represent knowledge and introduces a mapping matrix to map entity vectors to different attribute spaces. RotatE [14] proposes rotations of complex vectors to better characterize the rules between relations. Inspired by the fact that concentric circles in the polar coordinate system can naturally reflect hierarchy, HAKE [20] maps entities into the polar coordinate system and can effectively model the semantic hierarchies in knowledge graphs.
Tensor Factorization Models define the dot product of tensors as the matching function. RESCAL [21] represents relations as full-rank matrices. On this basis, DistMult [22] restricts the mapping matrix to a diagonal matrix. ANALOGY [23] constrains the mapping matrix to be a normal matrix. ComplEx [15] extends DistMult with complex-valued matrices to represent relations, which better describes asymmetric and antisymmetric relations. TuckER [13] introduces Tucker decomposition and achieves state-of-the-art results on some datasets. SimplE [24] is a simple enhancement of CP that allows the two embeddings of each entity to be learned dependently. HolE [25] is a multiplicative model that is isomorphic to ComplEx [15]. Inspired by the recent success of automated machine learning (AutoML), AutoSF [26] proposes to automatically design scoring functions for distinct KGs with AutoML techniques.
Deep Learning Models use deep neural networks to perform knowledge graph completion. ConvE [27] and ConvKB [28] employ convolutional neural networks to define score functions. CapsE [29] embeds entities and relations into one-dimensional vectors under the basic assumption that different embeddings encode homologous aspects in the same positions. CompGCN [30] utilizes graph convolutional networks to update the knowledge graph embedding. Neural Tensor Network (NTN) combines E-MLP with several bilinear parts. Nathani et al. [31] propose a novel attention-based feature embedding that captures both entity and relation features in any given entity's neighborhood.
The representation of triples ranges from one-dimensional vectors to multi-dimensional tensors, and the matching function ranges from simple distance to hyperplane mapping. Together, these studies make improvements to have a better description of knowledge. Although such methods perform better on basic datasets, they are unexplainable and sensitive to parameters.

2.2. Methods Based on Observed Features

The methods based on observed features belong to symbolic reasoning. They mine relevant relation rules based on observable statistical features and then accomplish reasoning with these relation rules. Collectively, these methods generally apply association algorithms to mine Horn rules in the knowledge base, and there are also methods mining rules with experts.
Such methods originated from inductive logic programming. Sherlock [17] is a typical unsupervised method for mining logic rules. It extracts first-order Horn rules from web text and reasons with probabilistic graphical models (PGMs). Similar ILP methods are WARMR [16] and ALEPH [32]. However, these methods were not designed for open-domain knowledge bases and are not suitable for knowledge graph reasoning. Pang-Ning Tan [33] studies the association mining method, and most subsequent rule mining adopts this association method. PRA mines paths with a high probability of occurrence through random walks and incorporates path features as matrix features into machine learning. AMIE [5] is a typical method for mining association rules based on the partial completeness assumption (PCA). RuleN [18] improves on AMIE by mining rules on part of the knowledge base, integrating path search with AMIE. AnyBURL [34,35] proposes an anytime bottom-up technique for learning logical rules from large knowledge graphs and applies the learned rules to predict candidates in the context of knowledge graph completion.
Overall, these approaches mine the closed Horn rules existing in knowledge bases and use them to accomplish reasoning. These approaches perform poorly on standard datasets, but they are explainable.

3. Background

In this section, we introduce Horn rules and related concepts.
Let $E$ denote the set of all entities and $R$ the set of all relations present in KGs. In the following, we use the notation (h, r, t) (head entity, relation, tail entity) to identify a triple in a KG, with $h, t \in E$ and $r \in R$ denoting the subject (head), the object (tail), and the relation between them, respectively.
Horn Rule. An atom is a fact that has variables at the subject or object position. A Horn rule consists of a head and a body, where the head is a single atom and the body is a set of atoms. The paper denotes a rule with head $r(x, y)$ and body $\{B_1, \ldots, B_n\}$ as:
$$B_1 \wedge B_2 \wedge \cdots \wedge B_n \Rightarrow r(x, y)$$
where $B_1$ represents the atom $r_1(x, z_1)$, $B_i$ represents the atom $r_i(z_{i-1}, z_i)$, and $B_n$ represents the atom $r_n(z_m, y)$. Horn rules can be abbreviated as $B \Rightarrow r(x, y)$. An instance of such a rule is:
$$\mathit{hasChild}(p, c) \wedge \mathit{isCitizen}(p, s) \Rightarrow \mathit{isCitizen}(c, s)$$
In this paper, the relation rules mined by HRER are Horn rules. We infer the head of a rule from its body.
Support. The support of a rule quantifies the number of correct predictions, i.e., the number of distinct pairs of subjects and objects in the head. We calculate support as:
$$\mathit{supp}(B \Rightarrow r(x, y)) := \#(x, y) : \exists z_1, \ldots, z_m : B \wedge r(x, y)$$
Head Coverage. Support is an absolute quantitative indicator. The same number of supports in knowledge bases with different scales have different meanings, so the literature [5] designs the relative indicator Head Coverage. Head Coverage is the proportion of pairs from the head relation that are covered by the predictions of the rule:
$$\mathit{hc}(B \Rightarrow r(x, y)) := \frac{\mathit{supp}(B \Rightarrow r(x, y))}{\#(x, y) : r(x, y)}$$
with $\#(x, y) : r(x, y)$ as an abbreviation for $|\{(x, y) : x, y \in E,\ r(x, y)\}|$.
Standard Confidence. The standard confidence measure takes all facts that are not in the KGs as negative evidence. Thus, the standard confidence of a rule is the ratio of its predictions that are in the KGs, i.e., the share of A in the set of predictions:
$$\mathit{conf}(B \Rightarrow r(x, y)) := \frac{\mathit{supp}(B \Rightarrow r(x, y))}{\#(x, y) : \exists z_1, \ldots, z_m : B}$$
The standard confidence is blind to the distinction between “false” and “unknown”. Thus, it implements a closed world setting. It mainly describes the known data and penalizes rules that make a large number of predictions in the unknown region. Reasoning, in contrast, aims to maximize the number of true predictions that go beyond the current knowledge. We do not want to describe data but to predict data.
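The three measures above can be illustrated on a toy knowledge graph. The following is a minimal sketch, not the AMIE implementation: the entity names, the triples, and the helper functions `facts`, `body_predictions`, and `rule_metrics` are all hypothetical, and the rule is hard-coded to the citizenship example given earlier.

```python
# Toy KG as a set of (head, relation, tail) triples; all names are illustrative.
KG = {
    ("anna", "hasChild", "ben"),
    ("anna", "isCitizen", "france"),
    ("ben", "isCitizen", "france"),
    ("carl", "hasChild", "dora"),
    ("carl", "isCitizen", "spain"),
    ("eva", "isCitizen", "italy"),
}


def facts(kg, relation):
    """Return the set of (head, tail) pairs stored for one relation."""
    return {(h, t) for h, r, t in kg if r == relation}


def body_predictions(kg):
    """Pairs (c, s) produced by the body hasChild(p, c) AND isCitizen(p, s)."""
    children = facts(kg, "hasChild")    # pairs (p, c)
    citizens = facts(kg, "isCitizen")   # pairs (p, s)
    return {(c, s) for p, c in children for q, s in citizens if p == q}


def rule_metrics(kg):
    """Support, head coverage, and standard confidence of the toy rule."""
    predictions = body_predictions(kg)
    head_facts = facts(kg, "isCitizen")
    supp = len(predictions & head_facts)   # predictions already in the KG
    hc = supp / len(head_facts)            # share of head facts covered
    conf = supp / len(predictions)         # share of predictions in the KG
    return supp, hc, conf


supp, hc, conf = rule_metrics(KG)
print(supp, hc, conf)
```

In this toy graph the prediction isCitizen(dora, spain) is missing from the KG, so the standard confidence counts it as negative evidence even though it may simply be unknown.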
Partial Completeness. AMIE [5] proposes to generate negative evidence by the partial completeness assumption (PCA). This is the assumption that if $r(x, y) \in KB_{true}$ for some $x$, $y$, then
$$\forall y' : r(x, y') \in KB_{true} \cup NEW_{true} \Rightarrow r(x, y') \in KB_{true}$$
In other words, AMIE assumes that if the database knows some r-attribute of x, then it knows all r-attributes of x. This assumption is valid for functional relations r, such as birth dates, capitals, etc. These usually contain either all r-values or none for a given entity. The assumption is also valid in the vast majority of cases for relations that are not functional, but that have high functionality. Even for other relations, the PCA is still reasonable for knowledge bases that have been extracted from a single source (such as DBpedia and YAGO). These usually contain either all r-values or none for a given entity.
PCA Confidence. Recall the partial completeness assumption of AMIE [5]: if some r-attributes of entity x appear in the given knowledge base, then the model assumes that the knowledge base includes all r-attributes of x, so no further r-attribute of x remains to be inferred. AMIE therefore changes the denominator to the set of facts known to be correct together with the facts assumed to be false.
Under the PCA, AMIE [5] normalizes the confidence not by the entire set of facts but by the set of facts that are known to be true, together with the facts that are assumed to be false. If the head atom of the rule is $r(x, y)$, then this set is just the set of facts $\{r(x, y') : r(x, y') \in K\}$. Thanks to the FUN-Property, the PCA is always applied to the first argument of the head atom:
$$\mathit{pcaconf}(B \Rightarrow r(x, y)) := \frac{\mathit{supp}(B \Rightarrow r(x, y))}{\#(x, y) : \exists z_1, \ldots, z_m, y' : B \wedge r(x, y')}$$
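To make the difference from standard confidence concrete, the sketch below computes both measures from two sets of pairs. It is an illustration under stated assumptions rather than AMIE's code: the function names and the toy pairs are hypothetical, and the PCA denominator is taken to be the predicted pairs whose first argument already has at least one known fact for the head relation.

```python
def standard_confidence(predictions, head_facts):
    """Share of body predictions that already appear as head facts."""
    return len(predictions & head_facts) / len(predictions)


def pca_confidence(predictions, head_facts):
    """Confidence under the PCA: only predictions (x, y) whose x has at
    least one known fact r(x, y') count in the denominator."""
    known_subjects = {x for x, _ in head_facts}
    pca_body = {(x, y) for x, y in predictions if x in known_subjects}
    if not pca_body:
        return 0.0
    return len(predictions & head_facts) / len(pca_body)


# Hypothetical pairs for the head relation isCitizen(x, y).
head_facts = {("anna", "france"), ("ben", "france"),
              ("carl", "spain"), ("eva", "italy")}
# Pairs produced by the rule body; "dora" has no known citizenship at all.
predictions = {("ben", "france"), ("dora", "spain")}

std = standard_confidence(predictions, head_facts)
pca = pca_confidence(predictions, head_facts)
print(std, pca)
```

Because nothing is known about the citizenship of "dora", the PCA does not treat the prediction for her as a counterexample, whereas standard confidence does.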
AMIE. AMIE [36] implements rule mining through a parallel search, which has high computing efficiency. AMIE [36] utilizes language bias to limit the search space, i.e., each atom in the rule is related to other atoms through the head entity or tail entity. AMIE defines that a rule is closed if every variable appears at least twice. AMIE only mines closed rules. This paper mines Horn rules through the rule mining algorithm in AMIE.

4. HRER Model

This section introduces HRER—a knowledge graph reasoning model based on Horn Rule and Entity Rule. Section 4.1 introduces the overall framework of the model. Section 4.2 introduces the implementation method of relation rules, which mainly explains the key indicator: Horn rule reliability. Section 4.3 introduces the implementation of entity rules.

4.1. Model Overview

Figure 1 shows the overall implementation process of HRER. HRER consists of mining the Horn rules and the entity rules in KGs. Then we combine the two rules with different weights and perform the link prediction.
For fast searching, the first part of the rule mining step adopts the rule mining algorithm of AMIE [36] to mine Horn rules, which are filtered by the Horn rule reliability (HRR) described in detail in Section 4.2. The second part is similar to the first: we apply association algorithms to mine entity rules. Finally, we combine the two types of rules using specific weights based on the overall performance on the training dataset and perform link prediction on the test dataset.
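The text does not spell out how the two rule types are merged, so the following is only one possible reading: a weighted sum of per-candidate scores. The function `merge_predictions`, the weights, and the candidate scores are all hypothetical and are not taken from the paper.

```python
def merge_predictions(horn_scores, entity_scores, w_horn, w_entity):
    """Rank candidates by a weighted sum of the two rule-based scores.

    A candidate that is missing from one score table contributes 0 there.
    Ties are broken alphabetically so that the ranking is deterministic.
    """
    candidates = set(horn_scores) | set(entity_scores)
    combined = {
        c: w_horn * horn_scores.get(c, 0.0) + w_entity * entity_scores.get(c, 0.0)
        for c in candidates
    }
    return sorted(combined, key=lambda c: (-combined[c], c))


# Hypothetical scores for the query (h, r, ?).
horn_scores = {"paris": 0.6, "lyon": 0.3}
entity_scores = {"lyon": 0.5, "nice": 0.2}

ranking = merge_predictions(horn_scores, entity_scores, w_horn=0.7, w_entity=0.3)
print(ranking)
```

In such a scheme the two weights would be chosen by comparing link prediction quality on the training data, which matches the description above only at a high level.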

4.2. Reasoning Based on Relation Rules

As in common methods based on logic rules [36], the first step is to mine the Horn rules. This paper only mines closed Horn rules with association algorithms. For better understanding, we take the following 2-hop closed Horn rule as an example:
$$\mathit{MotherOf}(p, c) \wedge \mathit{MarryTo}(p, s) \Rightarrow \mathit{FatherOf}(c, s)$$
Figure 2 shows the mechanism of the association search method. By traversing all relations, we filter triples crossing heads/tails to build a closed-loop, i.e., the closed Horn rule needed to mine. The specific implementation can be seen in AMIE [36].
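The closed-loop idea can be sketched in a few lines. This is a simplified illustration, not the AMIE algorithm: it only looks for two-atom bodies, treats every fact as traversable in both directions (the inverse direction is marked with a hypothetical "^-1" suffix), and counts for each body the number of head facts it closes a loop with. The toy triples reuse the citizenship rule from Section 3.

```python
from collections import Counter, defaultdict


def mine_two_hop_bodies(kg, head_relation):
    """Count, for each two-atom body, how many head facts it forms a loop with."""
    # Index outgoing edges; every fact also yields an inverse edge "r^-1".
    out_edges = defaultdict(set)
    for h, r, t in kg:
        out_edges[h].add((r, t))
        out_edges[t].add((r + "^-1", h))

    body_support = Counter()
    for x, r, y in kg:
        if r != head_relation:
            continue
        bodies = set()
        for r1, z in out_edges[x]:
            if z in (x, y):          # the intermediate entity must be a third one
                continue
            for r2, end in out_edges[z]:
                if end == y:         # path x -> z -> y closes the loop with r(x, y)
                    bodies.add((r1, r2))
        body_support.update(bodies)  # each body counted once per head fact
    return body_support


KG = {
    ("anna", "hasChild", "ben"), ("anna", "isCitizen", "france"),
    ("ben", "isCitizen", "france"), ("carl", "hasChild", "dora"),
    ("carl", "isCitizen", "spain"), ("dora", "isCitizen", "spain"),
    ("eva", "isCitizen", "italy"),
}

bodies = mine_two_hop_bodies(KG, "isCitizen")
for body, count in sorted(bodies.items()):
    print(body, count)
```

The body ("hasChild^-1", "isCitizen") corresponds to hasChild(p, c) ∧ isCitizen(p, s) ⇒ isCitizen(c, s); the counts it returns are the supports that the reliability indexes below build on.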
The reliability of Horn rules mined from different facts is different, so we need to measure the reliability of the mined rules. Traditional methods use indicators such as Standard Confidence and PCA Confidence to rank the rules. However, the design of these indicators does not consider the uncertainty of incomplete information and the uneven distribution of facts in KGs. The following part focuses on the motivation and the design of the index Horn Rule Reliability (HRR).
The incompleteness of the KGs is uncertain. Existing KGs commonly contain more known triples than unknown triples. However, for some special logic rules, the number of unknown triples may exceed that of known triples. As shown in Figure 3, the area of unknown triples exceeds that of known triples in the KGs. The traditional rule indexes do not consider this phenomenon, and they use the total number of triples in KGs. For example, Standard Confidence considers all entity pairs involved in the relations, so the rationality of rules decreases as the proportion of unknown knowledge increases.
Given the uncertainty of the incomplete information in the knowledge base, this paper does not consider all the information of the head relation (i.e., the total number of entity pairs involved in the head relation) in indicator design and only measures the number of entity pairs involved in the Horn rules. In the toy example shown in Figure 4, the number of Horn rules is smaller than the number of entity pairs with the same head relation as the rule. There is only one Horn rule, but the relation (i.e., $\mathit{fatherOf}$) involves four entity pairs. When measuring rule reliability, this paper only considers the triple (i.e., $\mathit{father}(C, B)$) involved in Horn rules and ignores the other three entity pairs of the relation (i.e., $\mathit{father}$).
Biased distribution of facts. The KGs are collections of triples extracted from the texts. The triples described in the text cannot be roughly uniformly distributed like datasets in other domains, such as signal processing. Even the standard dataset cannot guarantee the uniform distribution of the entities involved in each rule. The knowledge base constructed by the actual application cannot guarantee the balance even more. As shown in Figure 5, there may be different bodies pointing to the same head. In order to eliminate the imbalance of bodies in Horn rules, this paper only considers the number of head-to-tail pairs involved in the body. For instance, this paper only counts the types of head-to-tail pairs in Figure 5, i.e., the number of Horn rules is three.
Considering the uncertainty of the incomplete information and the biased distribution of facts, this paper designs a preliminary rule reliability index $HRR_{ini}$ based on the Standard Confidence:
$$HRR_{ini} = \frac{\mathit{supp}(B \Rightarrow r(x, y))}{\#(x, y') : B \Rightarrow r(x, y') + \#(x', y) : B \Rightarrow r(x', y) - \mathit{supp}(B \Rightarrow r(x, y))}$$
where $\#(x, y') : B \Rightarrow r(x, y')$ is an abbreviation for $|\{(x, y') : (x, y') \in B \Rightarrow r(x, y')\}|$, which refers to the number of all head-to-tail pairs that satisfy $r(x, ?)$, $x \in X$ (where $X$ refers to the head entities that appear in the Horn rule $B \Rightarrow r(x, y)$), and $\#(x', y) : B \Rightarrow r(x', y)$ is an abbreviation for $|\{(x', y) : (x', y) \in B \Rightarrow r(x', y)\}|$, which refers to the number of all head-to-tail pairs that satisfy $r(?, y)$, $y \in Y$ (where $Y$ refers to the tail entities that appear in the Horn rule $B \Rightarrow r(x, y)$). The value of $HRR_{ini}$ lies in the interval $(0, 1]$.
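The preliminary index can be sketched directly from sets of pairs. The code below is one reading of the definition, with loud assumptions: X and Y are taken to be the heads and tails of the rule instances that are confirmed by the KG, the denominator is the two neighbourhood counts minus the support, and the function name and all pairs are hypothetical.

```python
def hrr_ini_from_pairs(predictions, head_facts):
    """Preliminary Horn rule reliability from predicted and known pairs.

    Assumption: X and Y are the heads and tails of the supported pairs,
    i.e. the rule instances whose head fact is present in the KG.
    """
    supported = predictions & head_facts
    if not supported:
        return 0.0
    heads_in_rule = {x for x, _ in supported}   # X
    tails_in_rule = {y for _, y in supported}   # Y
    n_head = sum(1 for x, _ in head_facts if x in heads_in_rule)  # pairs r(x, ?)
    n_tail = sum(1 for _, y in head_facts if y in tails_in_rule)  # pairs r(?, y)
    return len(supported) / (n_head + n_tail - len(supported))


# Hypothetical pairs of one head relation; only ("C", "B") is a rule instance.
head_facts = {("C", "B"), ("C", "D"), ("E", "B"), ("F", "G")}
predictions = {("C", "B"), ("H", "I")}

value = hrr_ini_from_pairs(predictions, head_facts)
print(value)
```

In this toy case the unrelated pair ("F", "G") and the unknown prediction ("H", "I") do not enter the denominator, which is the behaviour the two motivations above ask for.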
The credibility of rule reliability. In order to compare the relative reliability of Horn rules, the reliability index $HRR_{ini}$ transforms the rules to the same scale. Nevertheless, the credibility of each rule's reliability is different. For example, if the Horn rules of the same relation have different numbers of occurrences, then the credibility of these Horn rules is different. In the following formulas, the first rule appears twice, and its reliability is 100%. The second rule appears 95 times, and its reliability is 95%. However, the second rule is more credible than the first rule.
$$HRR_{ini}^{1} = \frac{2}{2} = 100\%$$
$$HRR_{ini}^{2} = \frac{95}{100} = 95\%$$
In order to measure the credibility of the rule reliability, the model designs a credibility index: the ratio of the number of Horn rules and the number of head-to-tail pairs involved in the relation.
$$RC = \frac{\mathit{supp}(B \Rightarrow r(x, y))}{\#(x, y) : r(x, y)}$$
Horn rule reliability index. Combined with the credibility of the index, the final reliability index $HRR$ is:
$$HRR = HRR_{ini} \times RC$$
This indicator compares different Horn rules for the same relation, and the same relation always contains the same number of head-to-tail pairs. Therefore, by removing the denominator term $\#(x, y) : r(x, y)$, the reliability index of Horn rules can be simplified as follows.
$$HRR = \frac{(\mathit{supp}(B \Rightarrow r(x, y)))^2}{\#(x, y') : B \Rightarrow r(x, y') + \#(x', y) : B \Rightarrow r(x', y) - \mathit{supp}(B \Rightarrow r(x, y))}$$
In the first part of the HRER model, we calculate the $HRR$ of each mined Horn rule. For the various Horn rules with the same head, we rank the tails predicted by the Horn rules according to $HRR$.
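The two example rules from the text can be pushed through these formulas numerically. In the sketch below, the support and denominator values (2 of 2 and 95 of 100) come from the example above, while the number of head-to-tail pairs of the relation, `N_HEAD_PAIRS = 200`, is a hypothetical value chosen only for illustration; the function names are likewise assumptions.

```python
import math


def hrr_ini(supp, denom):
    """Preliminary reliability: support over the HRR_ini denominator."""
    return supp / denom


def rule_credibility(supp, n_head_pairs):
    """RC: support over the number of head-to-tail pairs of the relation."""
    return supp / n_head_pairs


def hrr(supp, denom, n_head_pairs):
    """Full index HRR = HRR_ini * RC."""
    return hrr_ini(supp, denom) * rule_credibility(supp, n_head_pairs)


def hrr_simplified(supp, denom):
    """Simplified index: the relation-wide constant is dropped."""
    return supp ** 2 / denom


N_HEAD_PAIRS = 200  # hypothetical size of the head relation
rules = {"rule_1": (2, 2), "rule_2": (95, 100)}  # (supp, HRR_ini denominator)

scores = {name: hrr(s, d, N_HEAD_PAIRS) for name, (s, d) in rules.items()}
simplified = {name: hrr_simplified(s, d) for name, (s, d) in rules.items()}
print(scores)
print(simplified)
```

Both forms rank rule_2 above rule_1, and the simplified value is the full value multiplied by the constant number of head pairs, so dropping that constant does not change the ranking within one relation.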

4.3. Reasoning Based on Entity Rules

Embedding-based methods are the process of learning the representations of entities and relations by solving an optimization problem of maximizing the scores of correct triples while minimizing the scores of error triples. The embeddings in KGs contain the links between relations and the connections between entities, e.g., the similarity of entities.
Limited representations restrict the performance of logic rules. Such methods only search for closed Horn rules, which accounts for their poor performance on standard datasets. Mining richer representations of relation rules is left for future research. Unlike embedding-based methods, logic-rule methods do not consider the links between entities and only mine relation rules. In order to alleviate this problem, this paper proposes entity rules.
The entity rule discussed in this paper refers to inclusion, i.e., a rule that an entity contains another entity. As shown in Figure 6, the rule that an entity contains another entity means the subordinate relationship between the entities. When two entities contain each other, then the two entities are equal. Given conditions that entity A belongs to entity B, we can infer that A has the same attributes as B. Entity rules achieve link prediction in this way.
Entity rule mining. Entity rule mining is similar to relation rule mining. This paper mines entity rules based on the association features of “pseudo triples”. The entity rules mined in this paper are similar to the single-hop Horn rules, so this paper utilizes the method of mining single-hop Horn rules to search for entity rules in “pseudo triples”.
We swap the tail and relation in the triple to reconstruct a new “pseudo triple”. See the example in Figure 7 for the generation of “pseudo triple”, i.e., the actual relation is regarded as “tail entity”, and the actual tail entity is regarded as “relationship”. In the first step of the rule mining, “pseudo triples” are input into the relationship rule mining program. We implement the mining of entity rules by searching for single-hop closed Horn rules.
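The construction of "pseudo triples" and a single-hop search over them can be sketched as follows. This is an illustration under assumptions, not the authors' mining program: the function names and the film triples are hypothetical, and the "rule" found is simply a pair of entities (a, b) together with the number of (head, relation) contexts they share.

```python
from collections import defaultdict
from itertools import permutations


def to_pseudo_triples(triples):
    """Swap relation and tail: (h, r, t) becomes (h, t, r)."""
    return [(h, t, r) for h, r, t in triples]


def entity_rule_support(triples):
    """Support of single-hop rules a(h, r) => b(h, r) over pseudo triples."""
    pairs_by_entity = defaultdict(set)
    for head, pseudo_relation, pseudo_tail in to_pseudo_triples(triples):
        # The original tail entity now plays the role of the relation.
        pairs_by_entity[pseudo_relation].add((head, pseudo_tail))

    support = {}
    for a, b in permutations(pairs_by_entity, 2):
        shared = len(pairs_by_entity[a] & pairs_by_entity[b])
        if shared:
            support[(a, b)] = shared
    return support


# Hypothetical triples: every film released in "france" is also released in "europe".
triples = [
    ("film1", "releasedIn", "france"), ("film1", "releasedIn", "europe"),
    ("film2", "releasedIn", "france"), ("film2", "releasedIn", "europe"),
    ("film3", "releasedIn", "europe"),
]

support = entity_rule_support(triples)
print(to_pseudo_triples(triples)[0])
print(support)
```

Here the candidate rule france(h, r) ⇒ europe(h, r) is supported by both films released in "france", which is the kind of inclusion between entities that entity rules are meant to capture.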
For the entities predicted by the entity rules, we rank them by the number of satisfied rules. Similar to the reliability of Horn rules ($HRR$), we use the reliability of entity rules ($ERR$) to sort the entity rules:
$$ERR = \frac{(\mathit{supp}(B \Rightarrow t(h, r)))^2}{\#(h, r') : B \Rightarrow t(h, r') + \#(h', r) : B \Rightarrow t(h', r) - \mathit{supp}(B \Rightarrow t(h, r))}$$
where $h$, $r$, and $t$ represent the head, relation, and tail in a triple, respectively. Here, $\#(h, r') : B \Rightarrow t(h, r')$ is an abbreviation for $|\{(h, r') : (h, r') \in B \Rightarrow t(h, r')\}|$, which refers to the number of all head-to-tail pairs that satisfy $t(h, ?)$, and $\#(h', r) : B \Rightarrow t(h', r)$ is an abbreviation for $|\{(h', r) : (h', r) \in B \Rightarrow t(h', r)\}|$, which refers to the number of all head-to-tail pairs that satisfy $t(?, r)$.

5. Experiments and Results

This section verifies the performance of HRER on link prediction tasks through experiments.

5.1. Datasets and Evaluations

Datasets for benchmarking link prediction should be obtained by sampling real-world KGs. We evaluate HRER using four standard link prediction datasets generated from actual scenarios (see Table 1). The four datasets are available at https://github.com/ibalazevic/TuckER (accessed on 20 January 2022).
  • FB15k [1]. This dataset is a subset of Freebase, a large, growing knowledge base of the real world.
  • FB15k-237 [37]. This dataset is obtained by eliminating the inverse and equal relations in FB15k, making it more difficult for simple models to do well.
  • WN18 [1]. This dataset is a subset of WordNet, a hierarchical database containing lexical relations between words.
  • WN18RR [27]. This dataset is achieved by excluding inverse and equal relations in WN18.
Evaluation Settings. We use evaluation metrics that are standard across the link prediction literature: mean reciprocal rank (MRR) and Hits@k, $k \in \{1, 3, 10\}$. Mean reciprocal rank is the average of the inverse of the rank assigned to the true triple over all candidate triples. Hits@k measures the percentage of times a true triple is ranked within the top k candidate triples. We evaluate the performance of link prediction in the filtered setting [1], i.e., all known true triples are removed from the candidate set except for the current test triple. For both metrics, higher values indicate better performance.
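The filtered ranking and the two metrics can be sketched as follows. The function names, candidate scores, and ranks are hypothetical, and the sketch assumes that a higher score means a more plausible candidate.

```python
def filtered_rank(scores, true_entity, other_true_entities):
    """Rank of the true entity after removing other known true answers."""
    true_score = scores[true_entity]
    better = sum(
        1
        for entity, score in scores.items()
        if score > true_score and entity not in other_true_entities
    )
    return better + 1


def mrr_and_hits(ranks, ks=(1, 3, 10)):
    """Mean reciprocal rank and Hits@k over a list of filtered ranks."""
    mrr = sum(1.0 / rank for rank in ranks) / len(ranks)
    hits = {k: sum(rank <= k for rank in ranks) / len(ranks) for k in ks}
    return mrr, hits


# Hypothetical candidate scores for one test query (h, r, ?).
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}
# "a" is another known true tail, so it is filtered out when ranking "c".
example_rank = filtered_rank(scores, true_entity="c", other_true_entities={"a"})

mrr, hits = mrr_and_hits([1, example_rank, 4])
print(example_rank, mrr, hits)
```

Without filtering, the true entity "c" would be ranked third; removing the other known true answer "a" moves it to rank two, which is why the filtered setting gives a fairer picture of the model.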

5.2. Parameter Settings

HRER contains two parts: mining Horn rules and mining entity rules (see Figure 1). The AMIE algorithm uses parallel computing to accomplish rule mining, which significantly improves its efficiency. Therefore, this paper applies the AMIE algorithm to mine rules, calculates the corresponding rule reliabilities HRR and ERR, and performs link prediction with the combination of the two types of rules.
Unlike embedding-based methods, which have many parameter settings, HRER only sets one parameter in rule mining to control the number of rules. This paper uses an HRR threshold in the interval [0, 1] in the rule mining step, and this threshold determines the number of mined rules. When it is set to 0, all Horn rules are mined; when it is set to 1, only a few Horn rules are mined. Generally speaking, a rule with lower reliability is less useful for link prediction, and if the AMIE algorithm returns all Horn rules, the large number of rules reduces the efficiency of link prediction. This paper sets the HRR threshold to 0.05 and the ERR threshold to 0.01 to control the number of rules.
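The effect of the threshold can be sketched as a simple filter over scored rules. The thresholds 0.05 and 0.01 are the values stated above; the function `filter_rules`, the rule strings, the scores, and the choice of a non-strict comparison are assumptions made for illustration.

```python
HRR_THRESHOLD = 0.05  # value stated in the text for Horn rules
ERR_THRESHOLD = 0.01  # value stated in the text for entity rules


def filter_rules(scored_rules, threshold):
    """Keep the rules whose reliability reaches the threshold."""
    return {rule: score for rule, score in scored_rules.items() if score >= threshold}


# Hypothetical Horn rules with hypothetical HRR values in [0, 1].
horn_rules = {
    "r1(x, z) & r2(z, y) => r(x, y)": 0.40,
    "r3(x, y) => r(x, y)": 0.05,
    "r4(x, z) & r5(z, y) => r(x, y)": 0.01,
}

kept = filter_rules(horn_rules, HRR_THRESHOLD)
print(len(kept), "of", len(horn_rules), "rules kept")
```

Setting the threshold to 0 keeps every rule, while raising it toward 1 leaves only the few rules with the highest reliability.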

5.3. Link Prediction Results

The experiments on link prediction mainly compare the following methods: TransE (i.e., the primary embedding method), STransE, CrossE, TorusE, RotatE, TuckER, DistMult, ComplEx, ANALOGY, SimplE, HolE, ConvE, ConvKB, ConvR, CapsE, RSN and AMIE. AMIE is run with its default parameters: head coverage equals 0.01, and PCA confidence equals 0.1. The experiment first performs link prediction with the first part and the second part of the HRER model separately; the two parts are then merged to obtain the final result of HRER.
Result Comparison. As can be seen from Table 2, the Horn rules filtered by HRR, as designed in this paper, outperform those filtered by PCA confidence in AMIE. Besides, entity rules bring a relative improvement of 0.22% and 2.32% in MRR and Hits@10, respectively, averaged over FB15K, FB15K-237, WN18 and WN18RR. Overall, HRER outperforms previous state-of-the-art models on all metrics on two datasets, FB15K and WN18; on FB15K-237, TuckER does better.
HRER has only one parameter to control the number of Horn rules. Without this threshold, performance on link prediction might be better. However, to reduce mining time and eliminate redundant rules, HRER sets the threshold to limit the number of rules. Since the model in this paper mines only closed Horn rules and simple entity rules, HRER is limited by the form of rule representation. Therefore, HRER cannot achieve the best Hits@10 on FB15K-237.
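For readers unfamiliar with the two AMIE thresholds, the following sketch computes head coverage and PCA confidence for one closed path rule on a toy graph, following the definitions in the AMIE paper [5]. It is not AMIE's implementation; the function name, the relation names and the toy triples are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def rule_metrics(triples: List[Triple], r1: str, r2: str, head: str) -> Dict[str, float]:
    """Metrics for the rule  r1(x, z) AND r2(z, y) => head(x, y)."""
    pairs_by_relation = defaultdict(set)
    for s, r, o in triples:
        pairs_by_relation[r].add((s, o))

    # All (x, y) pairs for which the body holds for some z.
    objects_via_r2 = defaultdict(set)
    for z, y in pairs_by_relation[r2]:
        objects_via_r2[z].add(y)
    body_pairs = {(x, y) for x, z in pairs_by_relation[r1] for y in objects_via_r2[z]}

    head_pairs = pairs_by_relation[head]
    support = len(body_pairs & head_pairs)

    # PCA: only count body pairs whose x has at least one known head fact.
    head_subjects = {x for x, _ in head_pairs}
    pca_body_pairs = {(x, y) for x, y in body_pairs if x in head_subjects}

    return {
        "support": float(support),
        "head_coverage": support / len(head_pairs) if head_pairs else 0.0,
        "std_confidence": support / len(body_pairs) if body_pairs else 0.0,
        "pca_confidence": support / len(pca_body_pairs) if pca_body_pairs else 0.0,
    }


# Toy graph (made up): rule  marriedTo(x, z) AND livesIn(z, y) => livesIn(x, y)
toy_kg = [
    ("ann", "marriedTo", "bob"), ("carl", "marriedTo", "dana"), ("eve", "marriedTo", "frank"),
    ("bob", "livesIn", "paris"), ("dana", "livesIn", "rome"), ("frank", "livesIn", "oslo"),
    ("ann", "livesIn", "paris"), ("carl", "livesIn", "milan"),
]
metrics = rule_metrics(toy_kg, "marriedTo", "livesIn", "livesIn")
passes_amie_defaults = metrics["head_coverage"] >= 0.01 and metrics["pca_confidence"] >= 0.1
```

In the toy graph, the prediction for "eve" is not counted against the rule under PCA because nothing is known about where "eve" lives, whereas the wrong prediction for "carl" is counted. This is the sense in which PCA confidence accounts for incomplete knowledge bases.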
Case Study and Interpretability. The most significant feature of HRER is interpretability. HRER can provide the reasoning basis for every prediction it makes. Table 3 shows part of the reasoning basis for link prediction on FB15K-237.
As can be seen from Table 3, the reasoning basis is consistent with our perception. For example, suppose we want to infer the release regions of a specific movie A. It is known that A is released in Region C and that Region C adjoins Region B, so the model infers that movie A will also be shown in Region B (Rule 3 in Table 3).
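The movie example can be reproduced with a short sketch that applies a two-atom path rule to a toy graph and returns the grounding triples as the explanation. This is an illustration only, not the HRER inference procedure: the function name, the shortened relation names and the entities are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def apply_path_rule(triples: List[Triple], body_r1: str, body_r2: str,
                    head_r: str) -> Dict[Triple, List[Triple]]:
    """Apply  body_r1(X, Z) AND body_r2(Z, Y) => head_r(X, Y).

    Returns each newly inferred triple with the two known triples that
    support it (the first grounding found), which serves as the explanation.
    """
    known: Set[Triple] = set(triples)
    objects_of: Dict[Tuple[str, str], Set[str]] = {}
    for s, r, o in triples:
        objects_of.setdefault((s, r), set()).add(o)

    inferred: Dict[Triple, List[Triple]] = {}
    for x, r, z in triples:
        if r != body_r1:
            continue
        for y in sorted(objects_of.get((z, body_r2), set())):
            candidate = (x, head_r, y)
            if candidate in known or candidate in inferred:
                continue  # skip facts that are already known or already derived
            inferred[candidate] = [(x, body_r1, z), (z, body_r2, y)]
    return inferred


# Toy graph mirroring the example in the text (shortened, hypothetical names).
toy_graph = [
    ("movie_A", "film_release_region", "region_C"),
    ("region_C", "adjoins", "region_B"),
]
predictions = apply_path_rule(
    toy_graph,
    body_r1="film_release_region",
    body_r2="adjoins",
    head_r="film_release_region",
)
```

Each key of `predictions` is a new triple and each value lists the known triples that triggered the rule, so a user can trace the prediction about Region B back to the release in Region C and the adjacency between the two regions.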

6. Conclusions

This paper proposes HRER, a new bottom-up rule learning model for link prediction. The major contributions of HRER are as follows. First, HRER designs a new Horn rule filtering index, HRR, to measure the reliability of Horn rules. Second, HRER proposes entity rules to compensate for the limited expressiveness of Horn rules. Third, HRER is interpretable and can explain its inferences. Finally, unlike embedding-based methods, HRER needs only a minimal parameter setting to control the number of rules. Experiments on the standard datasets show that HRER achieves state-of-the-art performance. In the future, our research will no longer be restricted to closed logic rules, and we will study more representations of rules. Recently, graph neural networks have achieved good performance on link prediction, so we also plan to leverage the graph attention framework to capture higher-order relations between entities.

Author Contributions

Conceptualization, Z.L. and J.Y.; validation, K.H.; formal analysis, Z.L.; investigation, H.L.; resources, L.C.; writing—original draft preparation, L.Q. and X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Anhui Provincial Natural Science Foundation, grant number 1908085MF202, and the Independent Scientific Research Program of National University of Defense Science and Technology, grant number ZK18-03-14.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets are available at https://github.com/ibalazevic/TuckER (accessed on 20 January 2022).

Acknowledgments

This work was partially supported by the Anhui Provincial Natural Science Foundation (No. 1908085MF202) and Independent Scientific Research Program of National University of Defense Science and Technology (No. ZK18-03-14).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–10 December 2013; pp. 2787–2795.
  2. Auer, S.; Bizer, C.; Kobilarov, G.; Lehmann, J.; Cyganiak, R.; Ives, Z. Dbpedia: A nucleus for a web of open data. In The Semantic Web; Springer: Berlin/Heidelberg, Germany, 2007; pp. 722–735.
  3. Carlson, A.; Betteridge, J.; Kisiel, B.; Settles, B.; Hruschka, E.R.; Mitchell, T.M. Toward an architecture for never-ending language learning. In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, Atlanta, GA, USA, 11–15 July 2010.
  4. Suchanek, F.M.; Kasneci, G.; Weikum, G. Yago: A large ontology from wikipedia and wordnet. J. Web Semant. 2008, 6, 203–217.
  5. Galárraga, L.A.; Teflioudi, C.; Hose, K.; Suchanek, F. AMIE: Association rule mining under incomplete evidence in ontological knowledge bases. In Proceedings of the 22nd International Conference on World Wide Web, Rio de Janeiro, Brazil, 13–17 May 2013; pp. 413–422.
  6. Mahdy, A.; Lotfy, K.; Ismail, E.; El-Bary, A.; Ahmed, M.; El-Dahdouh, A. Analytical solutions of time-fractional heat order for a magneto-photothermal semiconductor medium with Thomson effects and initial stress. Results Phys. 2020, 18, 103174.
  7. Mahdy, A.M. Numerical solutions for solving model time-fractional Fokker–Planck equation. Numer. Methods Partial Differ. Equ. 2021, 37, 1120–1135.
  8. Gao, L.; Zhu, H.; Zhuo, H.H.; Xu, J. Dual Quaternion Embeddings for Link Prediction. Appl. Sci. 2021, 11, 5572.
  9. Wang, P.; Zhou, J.; Liu, Y.; Zhou, X. TransET: Knowledge Graph Embedding with Entity Types. Electronics 2021, 10, 1407.
  10. Wang, M.; Qiu, L.; Wang, X. A Survey on Knowledge Graph Embeddings for Link Prediction. Symmetry 2021, 13, 485.
  11. Ji, G.; He, S.; Xu, L.; Liu, K.; Zhao, J. Knowledge graph embedding via dynamic mapping matrix. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China, 26–31 July 2015; Volume 1, pp. 687–696.
  12. Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; Zhu, X. Learning entity and relation embeddings for knowledge graph completion. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015.
  13. Balažević, I.; Allen, C.; Hospedales, T.M. Tucker: Tensor factorization for knowledge graph completion. arXiv 2019, arXiv:1901.09590.
  14. Sun, Z.; Deng, Z.H.; Nie, J.Y.; Tang, J. RotatE: Knowledge graph embedding by relational rotation in complex space. arXiv 2019, arXiv:1902.10197.
  15. Trouillon, T.; Welbl, J.; Riedel, S.; Gaussier, É.; Bouchard, G. Complex Embeddings for Simple Link Prediction. In Proceedings of the ICML, New York, NY, USA, 20–22 June 2016.
  16. Goethals, B.; Van den Bussche, J. Relational association rules: Getting Warmer. In Proceedings of the Pattern Detection and Discovery, London, UK, 16–19 September 2002; pp. 125–139.
  17. Schoenmackers, S.; Davis, J.; Etzioni, O.; Weld, D. Learning first-order horn clauses from web text. In Proceedings of the 2010 Conference on Empirical Methods on Natural Language Processing, Cambridge, MA, USA, 9–11 October 2010; pp. 1088–1098.
  18. Meilicke, C.; Fink, M.; Wang, Y.; Ruffinelli, D.; Gemulla, R.; Stuckenschmidt, H. Fine-grained evaluation of rule-and embedding-based systems for knowledge graph completion. In Proceedings of the International Semantic Web Conference, Monterey, CA, USA, 8–12 October 2018; pp. 3–20.
  19. Wang, Z.; Zhang, J.; Feng, J.; Chen, Z. Knowledge graph embedding by translating on hyperplanes. In Proceedings of the AAAI, Quebec City, QC, Canada, 27–31 July 2014; Volume 14, pp. 1112–1119.
  20. Zhang, Z.; Cai, J.; Zhang, Y.; Wang, J. Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 3065–3072.
  21. Nickel, M.; Tresp, V.; Kriegel, H.-P. A Three-Way Model for Collective Learning on Multi-Relational Data. In Proceedings of the 28th International Conference on Machine Learning, ICML 2011, Bellevue, WA, USA, 28 June–2 July 2011; pp. 809–816.
  22. Yang, B.; Yih, W.t.; He, X.; Gao, J.; Deng, L. Embedding entities and relations for learning and inference in knowledge bases. arXiv 2014, arXiv:1412.6575.
  23. Liu, H.; Wu, Y.; Yang, Y. Analogical inference for multi-relational embeddings. arXiv 2017, arXiv:1705.02426.
  24. Kazemi, S.M.; Poole, D. SimplE Embedding for Link Prediction in Knowledge Graphs. In Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, Montréal, QC, Canada, 3–8 December 2018; pp. 4289–4300.
  25. Nickel, M.; Rosasco, L.; Poggio, T.A. Holographic Embeddings of Knowledge Graphs. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; Schuurmans, D., Wellman, M.P., Eds.; AAAI Press: Palo Alto, CA, USA, 2016; pp. 1955–1961.
  26. Zhang, Y.; Yao, Q.; Dai, W.; Chen, L. AutoSF: Searching Scoring Functions for Knowledge Graph Embedding. In Proceedings of the 36th IEEE International Conference on Data Engineering, ICDE 2020, Dallas, TX, USA, 20–24 April 2020; pp. 433–444.
  27. Dettmers, T.; Minervini, P.; Stenetorp, P.; Riedel, S. Convolutional 2D knowledge graph embeddings. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
  28. Nguyen, D.Q.; Nguyen, T.; Nguyen, D.Q.; Phung, D.Q. A Novel Embedding Model for Knowledge Base Completion Based on Convolutional Neural Network. arXiv 2018, arXiv:1712.02121.
  29. Nguyen, D.Q.; Vu, T.; Nguyen, T.; Nguyen, D.Q.; Phung, D.Q. A Capsule Network-based Embedding Model for Knowledge Graph Completion and Search Personalization. arXiv 2019, arXiv:1808.04122.
  30. Vashishth, S.; Sanyal, S.; Nitin, V.; Talukdar, P. Composition-based Multi-Relational Graph Convolutional Networks. arXiv 2020, arXiv:1911.03082.
  31. Nathani, D.; Chauhan, J.; Sharma, C.; Kaul, M. Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs. In Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, 28 July–2 August 2019; Korhonen, A., Traum, D.R., Màrquez, L., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; Volume 1, pp. 4710–4723.
  32. Muggleton, S. Inverse entailment and Progol. New Gener. Comput. 1995, 13, 245–286.
  33. Tan, P.N.; Kumar, V.; Srivastava, J. Selecting the right interestingness measure for association patterns. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, AB, Canada, 23–26 July 2002; pp. 32–41.
  34. Meilicke, C.; Chekol, M.W.; Ruffinelli, D.; Stuckenschmidt, H. Anytime Bottom-Up Rule Learning for Knowledge Graph Completion. In Proceedings of the IJCAI, Macao, China, 10–16 August 2019; pp. 3137–3143.
  35. Meilicke, C.; Chekol, M.W.; Fink, M.; Stuckenschmidt, H. Reinforced Anytime Bottom Up Rule Learning for Knowledge Graph Completion. arXiv 2020, arXiv:2004.04412.
  36. Galárraga, L.; Teflioudi, C.; Hose, K.; Suchanek, F.M. Fast rule mining in ontological knowledge bases with AMIE+. VLDB J. 2015, 24, 707–730.
  37. Toutanova, K.; Chen, D.; Pantel, P.; Poon, H.; Choudhury, P.; Gamon, M. Representing text for joint embedding of text and knowledge bases. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; pp. 1499–1509.
Figure 1. Model Architecture.
Figure 2. This is a toy example of mining closed Horn rules.
Figure 3. There may be more unknown triples in the knowledge base.
Figure 4. The number of Horn rules is much lower than the number of head relationships.
Figure 5. There may be different bodies pointing to the same head.
Figure 6. This is the inclusion of properties between entities.
Figure 7. This is an example for the conversion of triples to pseudo triples.
Table 1. Dataset statistics.
| Dataset | #Entities | #Relations | #Triples | #Testset |
|---|---|---|---|---|
| FB15K | 14,951 | 1345 | 483,142 | 59,071 |
| FB15K-237 | 14,541 | 237 | 272,115 | 20,466 |
| WN18 | 40,943 | 18 | 141,442 | 5000 |
| WN18RR | 40,599 | 11 | 86,835 | 3134 |
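Table 2. Link prediction results on FB15K, FB15K-237, WN18 and WN18RR.
| Model | FB15K FHit@1/% | FB15K FHit@10/% | FB15K FMRR | FB15K-237 FHit@1/% | FB15K-237 FHit@10/% | FB15K-237 FMRR | WN18 FHit@1/% | WN18 FHit@10/% | WN18 FMRR | WN18RR FHit@1/% | WN18RR FHit@10/% | WN18RR FMRR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TransE | 49.36 | 84.73 | 0.628 | 21.72 | 49.68 | 0.315 | 40.56 | 94.87 | 0.646 | 2.70 | 49.52 | 0.206 |
| STransE | 39.77 | 79.60 | 0.543 | 22.48 | 49.56 | 0.315 | 43.12 | 93.45 | 0.656 | 10.13 | 42.21 | 0.226 |
| CrossE | 60.08 | 86.23 | 0.702 | 21.21 | 47.05 | 0.298 | 73.28 | 95.03 | 0.834 | 38.07 | 44.99 | 0.405 |
| TorusE | 68.85 | 83.98 | 0.746 | 19.62 | 44.71 | 0.281 | 94.33 | 95.44 | 0.947 | 42.68 | 53.35 | 0.463 |
| RotatE | 73.93 | 88.10 | 0.791 | 23.83 | 53.06 | 0.336 | 94.30 | 96.0 | 0.949 | 42.80 | 57.15 | 0.476 |
| DistMult | 73.61 | 86.32 | 0.784 | 22.44 | 49.01 | 0.313 | 72.60 | 94.61 | 0.824 | 39.68 | 50.22 | 0.433 |
| ComplEx | 81.56 | 90.53 | 0.848 | 25.72 | 52.97 | 0.349 | 94.53 | 95.50 | 0.949 | 42.55 | 52.12 | 0.458 |
| ANALOGY | 65.59 | 83.74 | 0.726 | 12.59 | 35.38 | 0.202 | 92.61 | 94.42 | 0.934 | 35.82 | 38.00 | 0.366 |
| SimplE | 66.13 | 83.63 | 0.726 | 10.03 | 34.35 | 0.179 | 93.25 | 94.58 | 0.938 | 38.27 | 42.65 | 0.398 |
| HolE | 75.85 | 86.78 | 0.800 | 21.37 | 47.64 | 0.303 | 93.11 | 94.94 | 0.938 | 40.28 | 48.79 | 0.432 |
| TuckER | 72.89 | 88.88 | 0.788 | 25.90 | 53.61 | 0.352 | 94.64 | 95.80 | 0.951 | 42.95 | 51.40 | 0.459 |
| ConvE | 59.46 | 84.94 | 0.688 | 21.90 | 47.62 | 0.305 | 93.89 | 95.68 | 0.945 | 38.99 | 50.75 | 0.427 |
| ConvKB | 11.44 | 40.83 | 0.211 | 13.98 | 41.46 | 0.230 | 52.89 | 94.89 | 0.709 | 5.63 | 52.50 | 0.249 |
| ConvR | 70.57 | 88.55 | 0.773 | 25.56 | 52.63 | 0.346 | 94.56 | 95.85 | 0.950 | 43.73 | 52.68 | 0.467 |
| CapsE | 1.93 | 21.78 | 0.087 | 7.34 | 35.60 | 0.160 | 84.55 | 95.08 | 0.890 | 33.69 | 55.98 | 0.415 |
| RSN | 72.34 | 87.01 | 0.777 | 19.84 | 44.44 | 0.280 | 91.23 | 95.10 | 0.928 | 34.59 | 48.34 | 0.395 |
| AMIE | 67.40 | 88.15 | 0.797 | 24.47 | 47.79 | 0.308 | 87.21 | 94.03 | 0.931 | 31.05 | 35.60 | 0.357 |
| Horn Rule | 84.27 | 89.01 | 0.861 | 25.10 | 48.22 | 0.312 | 93.47 | 95.32 | 0.941 | 44.16 | 50.98 | 0.465 |
| Ent Rule | 13.82 | 17.37 | 0.142 | 10.75 | 20.03 | 0.113 | 15.81 | 20.74 | 0.171 | 10.08 | 11.87 | 0.107 |
| HRER | 84.87 | 91.09 | 0.871 | 25.39 | 48.98 | 0.328 | 97.52 | 97.87 | 0.976 | 46.94 | 53.32 | 0.489 |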
Table 3. Horn rules mined in FB15K-237.
| Rule | Part | Atom |
|---|---|---|
| Rule 1 | head | (X, /sports/sports_team/roster./American_football/football_roster_position/position, Y) |
| Rule 1 | body | (X, /sports/sports_position/players./sports/sports_team_roster/team, Y) |
| Rule 2 | head | (X, /award/award_category/winners./award/award_honor/ceremony/football_roster_position/position, Y) |
| Rule 2 | body | (X, /award/award_category/category_of, Z) |
| Rule 2 | body | (Z, /time/event/instance_of_recurring_event, Y) |
| Rule 3 | head | (X, /film/film/release_date_s./film/film_regional_release_date/film_release_region, Y) |
| Rule 3 | body | (X, /film/film/release_date_s./film/film_regional_release_date/film_release_region, Z) |
| Rule 3 | body | (Z, /location/location/adjoin_s./location/adjoining_relationship/adjoins, Y) |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Liang, Z.; Yang, J.; Liu, H.; Huang, K.; Cui, L.; Qu, L.; Li, X. HRER: A New Bottom-Up Rule Learning for Knowledge Graph Completion. Electronics 2022, 11, 908. https://doi.org/10.3390/electronics11060908

Chicago/Turabian Style

Liang, Zongwei, Junan Yang, Hui Liu, Keju Huang, Lin Cui, Lingzhi Qu, and Xiang Li. 2022. "HRER: A New Bottom-Up Rule Learning for Knowledge Graph Completion" Electronics 11, no. 6: 908. https://doi.org/10.3390/electronics11060908
