Advancements in Complex Knowledge Graph Question Answering: A Survey

Song, Yiqing; Li, Wenfa; Dai, Guiren; Shang, Xinna

doi:10.3390/electronics12214395

Open Access Review

¹ Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China

² College of Robotics, Beijing Union University, Beijing 100101, China

³ Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing 100083, China

* Author to whom correspondence should be addressed.

Electronics 2023, 12(21), 4395; https://doi.org/10.3390/electronics12214395

Submission received: 17 September 2023 / Revised: 13 October 2023 / Accepted: 19 October 2023 / Published: 24 October 2023

(This article belongs to the Section Artificial Intelligence)

Abstract:

Complex Question Answering over Knowledge Graph (C-KGQA) seeks to solve complex questions using knowledge graphs. Currently, KGQA systems achieve great success in answering simple questions, while complex questions still present challenging issues. As a result, an increasing number of novel methods have been proposed to address this challenge. In this survey, we present two mainstream categories of methods for C-KGQA, divided according to how they represent and construct the knowledge graph, namely, graph metric (GM)-based methods and graph neural network (GNN)-based methods. We also acknowledge the influence of ChatGPT, which has prompted further research into utilizing knowledge graphs as a knowledge source to assist in answering complex questions, and we introduce methods based on the joint reasoning of pre-trained models and knowledge graphs. Furthermore, we compile research achievements from the past three years to make it easier for researchers with similar interests to find state-of-the-art work. Finally, we discuss the resources and evaluation methods for tackling C-KGQA tasks and summarize several research prospects in this field.

Keywords: knowledge graph question answering; question answering; complex question; survey

1. Introduction

A knowledge graph (KG) is a structured database for knowledge management. It describes the concepts and entities and their relationships in the real world and is used to extract knowledge in specific fields. Users query the knowledge in KGs in either triple format (subject, relation, object) [1] or text form. Knowledge graph question answering (KGQA) can be utilized to enhance the presentation of search engine results and improve the search experience. By incorporating knowledge graphs, search engines can offer more precise and relevant search outcomes, thereby assisting users in swiftly locating the desired information. Furthermore, employing a KGQA system for semantic search enables the provision of more accurate and personalized search results and recommendations. Various industrial sectors also employ domain-specific expert knowledge graphs [2,3,4,5] to manage data and guide decision-making processes through knowledge graph question answering. Large-scale KGs include Freebase [6], DBpedia [7], Wikidata [8], and YAGO [9]. KGQA tasks can be divided into simple questions and complex questions according to the difficulty of the question. A simple question has a direct answer that requires only a single fact in the KG, i.e., one subject–relation–object triple [10]. Complex questions need more knowledge than simple questions and require more sophisticated query operations, such as using indirect relationships between entities, multi-hop relationships, and qualitative and quantitative constraints, to collect facts in the knowledge graph. Although research on simple knowledge graph question answering has gradually matured, answering complex questions still poses significant difficulties.

The challenges in the field of complex question answering over KGs (C-KGQA) can be roughly divided into two categories of complexity: questions with constraints and questions involving multiple hops of relations. Researchers also try to handle constraints and multi-hop properties over KGs simultaneously (see [11]). A multi-hop question is one whose answer is not related to a single piece of knowledge but requires reasoning over multiple triples. Consider the question “What was the first novel of the 2022 Nobel Prize winner in Literature?” The KG contains the relevant triples, but the answer cannot be inferred from a single triple. Starting from the topic entity “Nobel Prize” in the question, the entity “Annie Ernaux” is found; then, another triple related to Annie Ernaux’s novels is found, and the final answer “Les Armoires vides” can be obtained. The corresponding process is shown in Figure 1. The model must deal with multiple entities and relationships from the KG, and the entities detected in these questions need to be linked to the KG triples. In this example, the reasoning process undergoes an information hop between two triples; in general, multi-hop questions are those that need to hop between two or more triples to pass messages. Constrained questions refer to questions that have certain constraints, which can be divided into the time type (“... years ago”), sequential type (“1st...”), quantitative type (“more than five times...”), and others. In Figure 1, we can lock the next triple with the time constraint “2022” and the sequential constraint “first”. These constraints modify the subject of a natural language question and, thus, change the final answer [12].
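The two-hop reasoning with a sequential constraint described above can be sketched in a few lines of Python. This is a minimal illustration rather than a method from the surveyed literature: the relation names `award.winner` and `author.novel`, the topic-entity string, and the helper names are assumptions made for the example, and the time constraint “2022” is folded into the topic entity for brevity.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Toy KG for illustration only; relation names are hypothetical.
TRIPLES: List[Triple] = [
    ("Nobel Prize in Literature 2022", "award.winner", "Annie Ernaux"),
    ("Annie Ernaux", "author.novel", "Les Armoires vides"),
    ("Annie Ernaux", "author.novel", "La Place"),
    ("Annie Ernaux", "author.novel", "Les Années"),
]

# Attribute used to resolve the sequential constraint "first".
PUBLICATION_YEAR: Dict[str, int] = {
    "Les Armoires vides": 1974,
    "La Place": 1983,
    "Les Années": 2008,
}


def hop(entities: List[str], relation: str) -> List[str]:
    """Follow `relation` one step from every entity in `entities`."""
    return [o for s, r, o in TRIPLES if s in entities and r == relation]


def answer_first_novel(topic_entity: str) -> str:
    winners = hop([topic_entity], "award.winner")  # hop 1: prize -> laureate
    novels = hop(winners, "author.novel")          # hop 2: laureate -> novels
    # Sequential constraint: keep the earliest publication.
    return min(novels, key=lambda title: PUBLICATION_YEAR[title])


answer = answer_first_novel("Nobel Prize in Literature 2022")
```

Neither triple alone determines the answer: the first hop yields the laureate, the second yields candidate novels, and the constraint selects among them.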

KGs, as a structured representation of relationships between entities and concepts in the real world, provide vital support for KGQA tasks [13]. We believe that solving C-KGQA can be primarily divided into two main tasks. One is to better represent natural language questions (NLQs) and extract the true intent and key information relevant to reasoning [14]. The other is to effectively represent the graph, learn the knowledge represented in the graph, and accurately answer the questions. Regarding related surveys, Wu et al. [15] initially divided KGQA methods into semantic parsing (SP)-based methods and information retrieval (IR)-based methods, and this classification framework is still prevalent. They also provided an overview of the paradigms and advancements in these two approaches. However, methods have evolved further since their survey was published. It is worth noting that they highlighted the challenge of the semantic gap in this field, which remains a prominent research trend today. Jr et al. [12] employed a systematic mapping methodology to provide an overview of C-KGQA. They offered a methodical mapping of the literature to identify the most popular venues, domains, and KGs utilized in the literature. The literature they compiled is available on a GitHub repository, providing a valuable resource for researchers seeking to explore relevant methodological studies. Furthermore, they presented a comprehensive overview of the methodologies, datasets, and metrics employed in the C-KGQA field while also addressing the challenges associated with the existing datasets. Lan et al. [10] conducted a comprehensive review of the recent advances in C-KGQA. They extensively discussed two mainstream categories of methods, provided a deep comparison of core modules, and presented a unified paradigm of neural–symbolic reasoning.
Moreover, they employed a fine-grained classification approach to offer a nuanced description of the challenges associated with these two methods and solutions in recent work. Furthermore, they explored the significant impact of cutting-edge pre-training language models (PLMs) on C-KGQA. Zhang et al. [16] examined how the ability to answer complicated factual questions has developed across different data sources. They identify the features that these strategies have in common and put them all under the analysis–extend–reason framework. Specifically, they represented C-KGQA as answering complex factual questions in structured data sources and divided them into semantic parsing (SP)-based methods and graph-based methods, which diverges from our classification approach. They offered valuable insights into the types of complex questions that each method is best suited for. Subsequently, they presented the framework of the two methods individually. However, their analysis of the engagement of pre-trained language models (PLMs) in C-KGQA was concise, with a focus solely on their role as generators.

Following a brief discussion, our work is distinct in that we divide the methods into three categories based on various subtasks and give an overview of a module for each category. Additionally, we present the research ideas behind the latest methods to facilitate researchers with similar interests in accessing the most recent research achievements. Finally, we review the resources and evaluation methods for tackling C-KGQA tasks and summarize the research prospects in the field.

2. Preliminary

In this section, we will introduce examples of graph representations and provide explanations of certain characters that may be encountered, thus facilitating the comprehension of subsequent sections for researchers.

2.1. Knowledge Graph

A KG aims to describe the concepts and entities, as well as their relationships, in the real world [17]. As an application of the semantic web, it is divided into three components: concepts, entities, and their relationship triples, which are described using the Resource Description Framework (RDF). The basic data model of RDF includes three object types: (1) Resource: objects that can be represented by RDF are called resources, including entities, events, and concepts on the Internet. (2) Relation: it mainly describes the characteristics of resources and the relationships between resources. (3) Statement: a statement contains three parts, represented as $\langle \text{subject}, \text{relation}, \text{object} \rangle$ and usually called an SPO triple [18]. In Figure 2, we show a KG instance in which each entity has a unique ID. The person entity “Jackie Chan” is associated with the named entity “FungGiu Lam” through the relation people.marriage.spouse, which can be stored in the KG as the SPO triple $\langle \text{Jackie Chan}, \text{people.marriage.spouse}, \text{FungGiu Lam} \rangle$.
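As a rough illustration of how SPO triples can be stored and looked up, the following Python sketch keeps triples in a set and answers pattern queries in which any position may be left open. The `TripleStore` class, its `match` method, and the second triple are hypothetical and introduced only for this example; real RDF stores rely on dedicated indexes and query engines.

```python
from collections import defaultdict
from typing import DefaultDict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TripleStore:
    """Toy in-memory store for (subject, relation, object) triples."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()
        # Index by subject so lookups from a known entity avoid a full scan.
        self.by_subject: DefaultDict[str, Set[Triple]] = defaultdict(set)

    def add(self, s: str, r: str, o: str) -> None:
        self.triples.add((s, r, o))
        self.by_subject[s].add((s, r, o))

    def match(self, s: Optional[str] = None, r: Optional[str] = None,
              o: Optional[str] = None) -> List[Triple]:
        """Return triples agreeing with the given positions; None is a wildcard."""
        pool = self.by_subject.get(s, set()) if s is not None else self.triples
        return sorted(t for t in pool
                      if (r is None or t[1] == r) and (o is None or t[2] == o))


kg = TripleStore()
kg.add("Jackie Chan", "people.marriage.spouse", "FungGiu Lam")
kg.add("Jackie Chan", "people.person.profession", "Actor")  # illustrative extra triple
spouses = kg.match(s="Jackie Chan", r="people.marriage.spouse")
```

Leaving a position as `None` plays the role of a variable in a triple pattern, which is the basic building block of structured graph queries.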

Large-scale open KGs (such as Freebase [6] and DBpedia [7]) are published following the rules of the Resource Description Framework (RDF), which supports structured graph query languages to retrieve and manipulate triples [10]. To facilitate querying large-scale knowledge graphs, the graph query language SPARQL [19] is often used to retrieve and manipulate the data stored in them. Similar graph query languages, such as Cypher [20] and Lambda-DCS [21], are also designed to query related knowledge graphs. How to unify the query language to improve the reusability of KGQA models is therefore an obvious challenge. Specific KGs are designed for specific research. For example, the language knowledge graphs WordNet [22] and ConceptNet [23], and OpenCyc, the open version of the largest general knowledge base Cyc, are the main knowledge graphs of this kind. HowNet [24] is a typical language cognitive knowledge graph. In addition, some knowledge graphs are constructed for specific domains, such as military or medical knowledge graphs.

2.2. Task Formulation

The knowledge is stored in the graph in the form of triples $(s, r, o)$, and we denote a graph as $G = \{\langle s, r, o \rangle \mid s, o \in E, r \in R\}$, where $E$ is the set of entities and $R$ is the set of relations. The token representation of a processed natural language question is denoted as $q = \{w_{1}, w_{2}, \ldots, w_{m}\}$, and the predicted answer is denoted as $\hat{A}_{q}$, usually assuming that the correct answer $A_{q}$ to the question can be found in $G$. The knowledge graph question-answering dataset $\{(q, A_{q})\}$ consists of question–answer pairs used to train the model.

The notation and problem definitions used in the following sections are given in Table 1.

3. Materials and Methods

We summarize the research on C-KGQA in recent years. The methods can be roughly divided into graph metric (GM)-based methods, graph neural network (GNN)-based methods, and a combination paradigm that joins PLMs with knowledge graphs.

3.1. Graph-Metric-Based Methods

Graph-metric-based approaches embed the knowledge of KG triples into a continuous, low-dimensional embedding space while preserving the original entities and relations in the vectors [13]. For example, each entity $e$ in a KG can be represented by a point in the vector space, and each relation $r$ can be modeled as an operation such as projection or translation [25]. Graph-metric-based methods are latent factorization methods that learn a scoring function $f(\cdot)$ to measure the plausibility of triplets $(\text{subject}, \text{predicate}, \text{object})$. Given $s, o \in E$ and $r \in R$, the complex metric embeddings $e_{s}, e_{r}, e_{o} \in \mathbb{C}^{d}$ are generated; then, by defining a scoring function $f(e_{s}, e_{r}, e_{o})$, the objective of GM-based methods is to minimize a loss function or maximize an optimization objective. In short, typical GM methods generally consist of three steps: (a) embedding entities and relations; (b) defining a scoring function; and (c) learning entity and relation representations [25].
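The three steps can be sketched with one concrete choice of scoring function. The example below assumes a TransE-style score [26] in real space, $f(e_{s}, e_{r}, e_{o}) = -\lVert e_{s} + e_{r} - e_{o} \rVert_{2}$, together with a margin ranking loss; the embedding values, the entity names, and the margin are hand-picked for illustration and are not learned or taken from any cited work.

```python
import math
from typing import Dict, List

Vector = List[float]

# (a) Embeddings: fixed toy values here; in practice they are learned.
ENTITY: Dict[str, Vector] = {
    "s": [0.0, 0.0],
    "o_true": [1.0, 1.0],
    "o_false": [3.0, -2.0],
}
RELATION: Dict[str, Vector] = {"r": [1.0, 1.0]}


# (b) Scoring function: higher means more plausible.
def transe_score(e_s: Vector, e_r: Vector, e_o: Vector) -> float:
    """Negative L2 distance between the translated subject and the object."""
    return -math.sqrt(sum((s + r - o) ** 2 for s, r, o in zip(e_s, e_r, e_o)))


# (c) Learning signal: a true triple should outscore a corrupted one by a margin.
def margin_loss(pos_score: float, neg_score: float, margin: float = 1.0) -> float:
    return max(0.0, margin - pos_score + neg_score)


pos = transe_score(ENTITY["s"], RELATION["r"], ENTITY["o_true"])
neg = transe_score(ENTITY["s"], RELATION["r"], ENTITY["o_false"])
loss = margin_loss(pos, neg)
```

With these values the true triple already outscores the corrupted one by more than the margin, so the loss is zero; swapping the two scores would yield a positive loss that a gradient step could reduce.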

In Figure 3, we show the overall pipeline architecture of the GM-based methods for C-KGQA. Given a question and an available KG, GM-based methods first learn the question embedding and then identify the topic entities and the constraint embedding. Furthermore, the KG representation in the embedding space is learned to obtain the entity and relation embeddings used for reasoning. The reasoning chain module jointly leverages the previously learned embeddings to provide a set of candidate answer paths and scores all candidate entities in the KG to obtain the final answer.

Representative GM-based approaches use a distance-based scoring function, defined as the distance between the subject entity and the object entity. For instance, TransE [26] or RESCAL [27] generate high-dimensional embeddings of entities in real space by learning a scoring function to address questions. However, this approach only considers individual questions and overlooks the intrinsic relationships, and thus applies only to one-hop problems. Another method is ComplEx [28], a tensor-decomposition-based approach that proposes complex-valued embeddings to enhance the modeling of asymmetric relations between entities and is linear in both space and time complexity owing to the benefits of the dot product. This motivated researchers to use this method in their recent work. Saxena et al. [29] proposed a method called KGT5, which tackles link prediction and KGQA over incomplete KGs as sequence-to-sequence tasks using a single encoder–decoder Transformer model. This approach significantly reduces the model size and maintains scalability compared to traditional KGE models while achieving excellent performance on KGQA tasks over incomplete KGs. RotatE [30] embeds entities and relations into a complex vector space by representing relations as complex rotations. This model excels in handling relations with multiple meanings and performs remarkably well in C-KGQA tasks [31]. Chen et al. [32] proposed a logical query embedding framework called FuzzQE, which is based on fuzzy logic and used for answering complex first-order logical queries on large-scale incomplete graphs. FuzzQE utilizes fuzzy logic to define logical operators and improves performance by learning entity and relation embeddings. Jin et al. [13] applied an additional inference process called the relational chain reasoning module to prune the candidate entities ranked by the answer filtering module.
They utilized long short-term memory (LSTM) networks [33] and the RoBERTa model to learn the semantic similarity between the relationship chains in the question description and the KG relationship chains. They also leveraged external supervision signals from the relational chains in the training samples to assist the inference. The effectiveness of this method in answering complex multi-hop questions was demonstrated. The success of knowledge graph embedding in diverse real-world NLP tasks inspired us to delve into its potential benefits within the KGQA task.
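To make the complex-valued scoring functions mentioned above more concrete, the sketch below gives plain-Python versions of a ComplEx-style score, $\mathrm{Re}(\sum_{i} e_{s,i}\, e_{r,i}\, \overline{e_{o,i}})$, and a RotatE-style score in which each relation component is a unit-modulus rotation $e^{i\theta}$. The one-dimensional embeddings are hand-picked assumptions, and summing element-wise moduli in `rotate_score` is one possible choice of norm rather than a statement about any particular implementation.

```python
import cmath
import math
from typing import List

CVector = List[complex]


def complex_score(e_s: CVector, e_r: CVector, e_o: CVector) -> float:
    """ComplEx-style score: real part of the trilinear product with conj(e_o)."""
    return sum(s * r * o.conjugate() for s, r, o in zip(e_s, e_r, e_o)).real


def rotate_score(e_s: CVector, phases: List[float], e_o: CVector) -> float:
    """RotatE-style score: negative distance after rotating e_s by the phases."""
    return -sum(abs(s * cmath.exp(1j * p) - o)
                for s, p, o in zip(e_s, phases, e_o))


# Hand-picked toy embeddings (assumptions for illustration).
subj: CVector = [1 + 0j]
obj: CVector = [0 + 1j]
rel: CVector = [0 + 1j]

forward = complex_score(subj, rel, obj)    # score of (subj, rel, obj)
backward = complex_score(obj, rel, subj)   # score of (obj, rel, subj)

quarter_turn = [math.pi / 2]
rotated_forward = rotate_score(subj, quarter_turn, obj)
rotated_backward = rotate_score(obj, quarter_turn, subj)
```

Because the object embedding is conjugated, swapping subject and object changes the ComplEx-style score, which is what allows asymmetric relations to be modeled; the rotation-based score behaves similarly, since rotating in one direction is not undone by rotating again.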

In Table 2, we have summarized the research in recent years that utilizes GM-based knowledge graph embedding models to enhance C-KGQA. We have analyzed their features, and we describe the memory complexity of the state-of-the-art KGE model [25], as well as the studies that utilize them, to facilitate researchers with similar interests in accessing this information.

3.2. Graph Neural Network (GNN)-Based Methods

Methods based on GNN are utilized to retrieve topic-specific graphs from the entire knowledge graph. Generally, entities one hop away from the topic entity and their connections form subgraphs that can solve simple questions. The NLQ and candidate answer in the subgraph can be represented as low-dimensional dense vectors. Different ranking functions [36,37] are proposed to calculate the similarity score of these candidate answers, and the entity with the highest score is considered the final predicted answer. Then, a memory network [38] is employed to generate the final answer. Recent works [39,40] have introduced attention mechanisms and multi-column modules to improve the ranking accuracy in this framework. We summarize the procedure of GNN-based methods into four modules, as illustrated in Figure 4.

Question Parsing: In this module, the question is encoded into a vector representation firstly to extract the topic entity and relationship, and then extract a subgraph related to the topic entity. This process [41] emphasizes the learning of the implicit mention of the question topic entity in the context and decomposes the question into multiple sub-questions. However, it is worth noting that they only consider single-hop relations, so the performance may be limited on problems involving multi-hop relations. Chen et al. [32] proposed a method for generating questions from knowledge graph subgraphs, aiming to generate natural language questions based on the subgraph and target answers. The approach utilizes a bidirectional Graph2Seq model to encode the KG subgraph, capturing both the context and structure information. Additionally, a node-level copying mechanism is employed in the RNN decoder to directly copy node attributes from the subgraph into the generated questions.

Subgraph generation: Starting from the topic entity, a subgraph related to a specific question is selected from the knowledge graph based on entity linking. This subgraph uses all entities and relationships related to the question as nodes and edges, respectively. Ideally, it should contain all relevant information for answering the question. Some graph filtering algorithms, such as Personalized PageRank (PPR) [42], can be used to trim the subgraph. Das et al. [37] proposed a semiparametric model that answers queries requiring subgraph reasoning patterns by leveraging the structural similarity between different subgraphs. The approach consists of a nonparametric component that dynamically retrieves similar queries and subgraphs, and a parametric component that identifies reasoning patterns from the subgraphs of the nearest neighbor queries and applies them to the subgraph of the target query. Additionally, the authors introduced an adaptive subgraph collection strategy to select a query-specific compact subgraph, enabling scalability on large-scale knowledge bases.
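A minimal sketch of this step, under simplifying assumptions, is given below: entities within k hops of the topic entity are collected by breadth-first expansion, and a power-iteration version of Personalized PageRank is used to keep only the highest-scoring entities. The function names, the toy graph, the restart probability `alpha`, and the choice to treat edges as undirected are all assumptions for illustration, and the topic entity is assumed to occur in at least one triple.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]
Adjacency = Dict[str, Set[str]]


def build_adjacency(triples: Iterable[Triple]) -> Adjacency:
    """Undirected entity adjacency (relation labels are ignored here)."""
    adj: DefaultDict[str, Set[str]] = defaultdict(set)
    for s, _, o in triples:
        adj[s].add(o)
        adj[o].add(s)
    return dict(adj)


def k_hop_entities(adj: Adjacency, topic: str, k: int) -> Set[str]:
    """Entities reachable from `topic` in at most k hops."""
    frontier, seen = {topic}, {topic}
    for _ in range(k):
        frontier = {n for e in frontier for n in adj.get(e, ())} - seen
        seen |= frontier
    return seen


def personalized_pagerank(adj: Adjacency, topic: str,
                          alpha: float = 0.15, iters: int = 50) -> Dict[str, float]:
    """Power iteration with restart probability `alpha` at the topic entity."""
    score = {n: 0.0 for n in adj}
    score[topic] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        for n, neighbors in adj.items():
            share = (1 - alpha) * score[n] / len(neighbors)
            for m in neighbors:
                nxt[m] += share
        nxt[topic] += alpha  # restart mass returns to the topic entity
        score = nxt
    return score


def top_entities(score: Dict[str, float], n: int) -> List[str]:
    return sorted(score, key=lambda e: score[e], reverse=True)[:n]


TRIPLES = [("A", "r1", "B"), ("B", "r2", "C"), ("C", "r3", "D"), ("A", "r4", "E")]
adj = build_adjacency(TRIPLES)
two_hop = k_hop_entities(adj, "A", 2)
ppr = personalized_pagerank(adj, "A")
kept = top_entities(ppr, 3)
```

The hop limit controls how far the subgraph extends, while the PageRank threshold trims entities that are reachable but only weakly connected to the topic entity.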

Subgraph retriever: A graph-based reasoning module converts the questions and candidate answers from the subgraph into low-dimensional dense vectors and performs semantic matching through vector-based calculation. During this process, the module propagates and aggregates information along adjacent entities, obtains path information, and, finally, updates according to the reasoning instructions. Retrieving subgraphs can simplify the reasoning process, and Zhang et al. [36] proposed a trainable subgraph retriever (SR) implemented via a dual encoder for KGQA. SR is decoupled from the subsequent reasoning process, allowing it to be combined with any subgraph-oriented KBQA model, creating a plug-and-play framework. Furthermore, when combined with NSM (a subgraph-oriented reasoner), SR achieved state-of-the-art performance in embedding-based KBQA methods through weakly supervised pre-training and end-to-end fine-tuning.
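The dual-encoder idea, in which the question and each candidate are encoded separately and compared by an inner product, can be illustrated with a deliberately simple stand-in. The sketch below replaces the learned encoders with bag-of-words counts, so it shows only the scoring structure and not the trained retriever of [36]; the stopword list, relation names, and function names are assumptions.

```python
import re
from collections import Counter
from typing import List

STOPWORDS = {"who", "what", "is", "the", "of", "a", "an"}


def encode(text: str) -> Counter:
    """Stand-in encoder: bag-of-words counts (a real system would use a PLM)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def dot(u: Counter, v: Counter) -> int:
    # Counter returns 0 for missing keys, so absent tokens contribute nothing.
    return sum(count * v[token] for token, count in u.items())


def rank_relations(question: str, relations: List[str]) -> List[str]:
    """Order candidate relations by inner product with the question encoding."""
    q = encode(question)
    return sorted(relations, key=lambda r: dot(q, encode(r)), reverse=True)


candidates = [
    "people.person.place_of_birth",
    "people.marriage.spouse",
    "film.actor.film",
]
ranked = rank_relations("Who is the spouse of Jackie Chan?", candidates)
```

Because the two encoders are independent, candidate encodings can be computed once and reused across questions, which is part of what makes this design attractive for retrieval.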

Answer generation: The answer generation module generates answers based on the final state of reasoning.

The method based on GNN can answer complex questions by constructing subgraphs and extracting relationships within them. However, this method often faces problems, such as excessively large or small subgraphs and poor information extraction in practical applications. To solve these problems, some improvement measures can be adopted, such as setting appropriate limiting conditions in advance to restrict the types of entities and relationships, narrowing down the scope of candidate subgraphs, and reducing redundant information. The efficiency of subgraph retrievers can be improved through policy adjustment; self-supervised methods can also be applied to improve the accuracy of answer sorting. We have compiled recent major improvements in this area and organized them in Table 3. With these improvements, we can more accurately answer complex knowledge graph question-answering tasks.

3.3. The Joint Reasoning of PLM+KG

Training language models on a large-scale text corpus through unsupervised pre-training and then fine-tuning the pre-trained language models on downstream tasks has become a popular paradigm in natural language processing [45]. This approach can also be used to improve the capabilities of knowledge graph question-answering systems. Pre-trained language models based on the Transformer architecture, such as BERT [46] and its variants [47] and the GPT series [48], have achieved the greatest success thus far. By pre-training on large-scale text data, these models learn powerful language representation abilities that can be applied to a variety of natural language processing tasks. Similarly, we summarize the procedure of the PLM+KG-based method into three modules, as illustrated in Figure 5. The challenge of the PLM+KG method is how to bridge the semantic gap between natural language questions and structured data; thus, we add the solutions from current research to the pipeline.

LM encoding: Due to the strong language representation ability of pre-trained models, some researchers adapt PLMs to the task of NLP. The main technique used by pre-trained language models is the Transformer structure, which uses a self-attention mechanism to extract important information from the question sequence and encode it into vector representations. At the same time, the Transformer has the characteristic of improving the representational ability by stacking multiple layers. Finally, the pre-trained language model encodes the entire natural language question into a vector sequence; then, in the joint reasoning module, the problem representation is further updated by combining knowledge graph information. In the KGQA task, the pre-trained model is fine-tuned to obtain the optimal problem representation. Phuc Do et al. [49] proposed an enhanced question-answering system that utilizes the BERT model and a knowledge graph. The system incorporates two models based on BERT: a triple classification model and a text classification model. The triple classification model is built using BERT and classifies triples generated from all meta paths. The text classification model, also based on BERT, handles the content of triples by converting them into text and addressing text classification problems.

Knowledge Representation Learning: To represent entities and relationships in knowledge graphs as vector forms, some graph embedding algorithms are widely used in knowledge graph representation learning. These algorithms map entities and relationships in knowledge graphs to low-dimensional vector spaces by maximizing the similarity between neighboring nodes and minimizing the similarity between non-neighbor nodes so that more efficient computation and reasoning can be performed. For example, Sun et al. [50] used the GCN representation learning method to learn node embedding by utilizing local neighborhood structure information, generating node representation by iteratively aggregating each entity and its surrounding relationships. At the same time, as a graph neural network based on attention mechanism, graph attention networks (GATs) [51] can aggregate neighboring nodes to different degrees to better capture the relationships between nodes, thereby generating node representation.
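The difference between uniform and attention-weighted neighbor aggregation can be sketched as follows. This is a simplified illustration: the attention scores are plain dot products passed through a softmax, whereas GATs [51] learn a linear transform and an attention vector, and the mean aggregator only approximates the normalized sum used by GCN-style layers. The feature values and function names are assumptions.

```python
import math
from typing import Dict, List

Vector = List[float]


def softmax(xs: List[float]) -> List[float]:
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def mean_aggregate(neighbors: List[str], features: Dict[str, Vector]) -> Vector:
    """Uniform aggregation: every neighbor contributes equally."""
    dim = len(features[neighbors[0]])
    return [sum(features[n][i] for n in neighbors) / len(neighbors)
            for i in range(dim)]


def attention_aggregate(node: str, neighbors: List[str],
                        features: Dict[str, Vector]) -> Vector:
    """Attention-weighted aggregation: similar neighbors contribute more."""
    h = features[node]
    weights = softmax([dot(h, features[n]) for n in neighbors])
    return [sum(w * features[n][i] for w, n in zip(weights, neighbors))
            for i in range(len(h))]


FEATURES: Dict[str, Vector] = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
uniform = mean_aggregate(["b", "c"], FEATURES)
attended = attention_aggregate("a", ["b", "c"], FEATURES)
```

Here node `a` attends more strongly to neighbor `b`, whose features resemble its own, so the attended representation leans toward `b` while the uniform one weights both neighbors equally.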

Joint Reasoning: Combining pre-trained language models and knowledge graph representation learning, a joint reasoning framework is constructed to jointly represent natural language questions and entities/relationships in the knowledge graph. In this framework, the pre-trained language model generates a question representation vector by inputting a natural language question, while knowledge graph representation learning generates a knowledge graph representation vector by inputting entities and relationships. These two vectors can be fused using some fusion strategies to obtain the final question–answer pairs. Wang et al. [52] designed an interactive scheme after each layer of the LM and GNN, allowing bidirectional information exchange between the two modalities through specially initialized interaction representations (interaction token for the LM and interaction token for the GNN).

Answer Prediction: After the joint reasoning module updates the representation of the question and knowledge graph, their combined representation is obtained. In this module, the score of the prediction result, which indicates the probability of the answer being correct, is computed based on the combined representation.
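One simple way to turn fused representations into answer probabilities is sketched below. It assumes element-wise product fusion of the question vector with each candidate entity vector, sums the fused vector into a logit, and normalizes the logits with a softmax; concatenation followed by a learned scoring layer is a common alternative. The vectors, candidate names, and function names are illustrative assumptions, not values from any surveyed model.

```python
import math
from typing import Dict, List

Vector = List[float]


def fuse(question_vec: Vector, entity_vec: Vector) -> Vector:
    """Element-wise product fusion (one of many possible strategies)."""
    return [q * e for q, e in zip(question_vec, entity_vec)]


def predict(question_vec: Vector,
            candidates: Dict[str, Vector]) -> Dict[str, float]:
    """Softmax over per-candidate logits derived from the fused vectors."""
    names = list(candidates)
    logits = [sum(fuse(question_vec, candidates[n])) for n in names]
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return {n: e / total for n, e in zip(names, exps)}


question_vec: Vector = [1.0, 0.5]
candidate_vecs: Dict[str, Vector] = {
    "candidate_1": [0.9, 0.8],
    "candidate_2": [0.1, 0.2],
}
probabilities = predict(question_vec, candidate_vecs)
best = max(probabilities, key=lambda n: probabilities[n])
```

The candidate with the highest probability is returned as the predicted answer; thresholding the probabilities instead would allow several answers to be returned.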

The main challenge of this class of methods is how to better integrate information between the question and the knowledge graph for joint reasoning. We have summarized the strategies of the latest methods and organized them in Table 4. In addition to the basic framework, researchers are continuously exploring new technologies and methods, such as attention mechanisms, multi-task learning, transfer learning, and semi-supervised learning [53], to improve the efficacy and generalization ability of the models. Furthermore, some language models, such as T5 [54] and GShard [55], have been proposed to integrate multiple tasks and languages during pre-training, which can further enhance the generalization ability and adaptability of models [56].

4. Resource and Evaluation

In this section, we provide a detailed overview of the resources and evaluation metrics required to address complex question-answering tasks. We discuss the knowledge graphs and datasets used in different subtasks and introduce evaluation metrics from various perspectives. In addition to traditional evaluation metrics, we acknowledge the emergence of new perspectives for assessing model performance with the advent of pre-trained models. We delve into these perspectives in the following subsections.

4.1. Resource

In this section, we will introduce the knowledge graphs and datasets that serve the tasks of C-KGQA. We provide an overview of commonly used general-purpose knowledge graphs and datasets. Additionally, we discuss the knowledge graphs and datasets [59] specific to certain subtasks or competitions. We acknowledge the importance of the availability of these resources and their impact on research and development.

4.1.1. KGs

Large-scale and general-purpose knowledge graphs, such as Freebase [6], YAGO [9], DBpedia [7], and Wikidata [8], are usually used to retrieve answers in question answering over knowledge graphs. They are primarily designed to tackle open-domain problems. For commonsense questions, ConceptNet [23] is typically used as the knowledge source for retrieval.

Freebase [6] is an open and collaborative database, in which nodes are defined using /type/object and links using /type/link. It is based on the RDF triple model [18] and stored in a graph database. Freebase enables quick traversal of arbitrary connections between topics and allows the easy addition of new schemas without changing the structure of the data. It should be noted that Freebase has been deprecated, but most current research works still use it.

DBpedia [7] is built by automatically extracting structured knowledge from Wikipedia pages and is one of the largest multi-domain knowledge bases in the world. CN-DBpedia is a Chinese knowledge graph that integrates general and domain-specific data.

YAGO [9] is a high-quality triple store that covers a broad range of concepts. It integrates data from Wikipedia, WordNet, and GeoNames, and links the highly accurate Wikipedia and WordNet sources together.

Wikidata [8] originated from Wikipedia and supports free collaborative editing in multilingual formats. It has been used as the knowledge base for Wikipedia and uses pages as the basic organizational unit, where entities refer to the top-level objects.

4.1.2. Datasets

Researchers have constructed diverse QA datasets to address different tasks. We categorize datasets based on the sub-tasks of complex knowledge graph question answering, and compile a list of datasets that contain constraint questions, as shown in Table 5, along with their source KGs, and an indication of providing corresponding query statements.

We have compiled a list of typical datasets in the C-KGQA field based on different sub-tasks, as shown in Table 5. The datasets for different sub-tasks are arranged according to the year of their creation. GrailQA [60] is a large-scale and high-quality knowledge graph question-answering dataset on Freebase, consisting of 64,331 questions labeled with answers and corresponding logic forms in different grammars such as SPARQL and S-expressions. It can be used to test three levels of generalization in the KGQA task: i.i.d., compositional, and zero-shot. KQA Pro [61] is a large-scale dataset containing nearly 120k natural language questions that require challenging multi-hop reasoning, attribute comparison, set operations, etc. This dataset offers KoPL and SPARQL queries for each question, making it valuable for evaluating KGQA system performance, as well as for comparing the effectiveness of different algorithms. Moreover, the release of this dataset provides new resources for further research and development in KGQA technology.

The QALD challenge has become a well-established competition in terms of KGQA on DBpedia facts. Based on the current feasibility, we have chosen two datasets to list in the table. The test dataset of LC-QuAD contains 1000 question–query pairs [62], and the training dataset contains 4000 question–query pairs. Compared to many other knowledge graph question-answering datasets, LC-QuAD is a relatively difficult dataset because its questions involve complex cross-link reasoning. Specifically, each sample in LC-QuAD consists of two parts: a question and a query graph. The question is a natural language question, while the query graph is a graphical representation of the corresponding SPARQL query. This enables LC-QuAD to be used to evaluate the performance of knowledge graph question-answering models based on graph neural networks. LC-QuAD2.0 [63] is an upgraded version of the LC-QuAD, which contains over 30,000 questions, 57 knowledge graphs, and approximately 185,000 logical expressions. Similar to LC-QuAD, this dataset is more difficult and extends the diversity and complexity of the questions. In contrast to LC-QuAD, the questions in LC-QuAD2.0 not only involve cross-link reasoning but also include more complex question types such as entity relation identification, property value calculation, and entity retrieval. Additionally, the dataset contains more phrases, named entities, properties, and categories, as well as more KG architectures and languages.

Table 5. Introduction of datasets: “LB” means Leaderboard; “MC” means metric.

Datasets	KG	Year	Size	LB Score ¹ (top three)	MC ²
ComplexQuestions [64]	Freebase [6]	2016	2100	43.3 ²⁰; 42.9 ²²; 42.8 ²⁰	Acc, P, R, F1
ComplexWebQuestions [19]	Freebase [6]	2018	34,689	70.4 ²¹; 53.9 ²¹; 50 ²⁰	P, F1
GrailQA [60]	Freebase [6]	2021	64,331	73.42 / 81.87 ²³; 75.38 / 81.70 ²²; 76.31 / 81.52 ²²	EM, F1
KQA Pro [61]	Wikidata [8]	2022	117,970	95.32 ²³; 93.85 ²²; 92.45 ²¹	Acc, F1
LC-QuAD [62]	DBpedia [7]	2018	5000	- / - / 91 ²²; 88 / 56 / 68 ²³; 88.11 / 83.04 / 83.08 ²²	P, R, F1
LC-QuAD2.0 [63]	DBpedia [7] and Wikidata [8]	2019	30,000	92 ²²; 91 ²²; 86 ²²	F1

¹ We list the scores in the order of the metrics listed in the MC column. The superscript on each score indicates the year of the update. To simplify the table, we have only listed the top three scores, with the highest score highlighted in bold.
² The metrics in the table: “Acc”: accuracy; “P”: precision; “R”: recall; “F1”: F1 score; “EM”: Exact Match.

In this section, we have introduced several commonly used datasets in C-KGQA tasks. We provided information on both traditional and newly released datasets, as well as some datasets used in competitions. Due to the limitations of the table, we have provided links to the official leaderboards and to detailed information for the aforementioned datasets so that researchers with similar interests can access them easily. We have noticed that some scores on the leaderboards lack clear sources, so we have excluded those from the scores listed in the table.

4.2. Metrics

To evaluate KGQA systems comprehensively, evaluation needs to be conducted from multiple perspectives. The main evaluation methods cover two aspects: the reliability and the robustness of the model.

4.2.1. Reliability

In KGQA tasks, each question has an answer set containing its correct answers, and the candidate with the highest confidence score is usually selected as the answer to the question. Let $Q$ be the set of questions, $q$ be a single question in $Q$, $A_{q}$ be the answer set of $q$, and $\hat{A}_{q}$ represent the predicted answer set for $q$.

Precision measures the proportion of correct predictions for a single question among all predicted answers. Its formula is as follows:

$$\mathrm{Precision} = \frac{| A_{q} \cap \hat{A}_{q} |}{| \hat{A}_{q} |} \tag{1}$$

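Accuracy represents the proportion of questions that the system answers correctly among all questions. Counting a question $q$ as correctly answered when its selected (highest-confidence) answer $\hat{a}_{q}$ belongs to $A_{q}$, its formula is as follows:

$$\mathrm{Accuracy} = \frac{| \{ q \in Q : \hat{a}_{q} \in A_{q} \} |}{| Q |} \tag{2}$$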

Hits@K is a metric defined based on accuracy. If the answer list is sorted (for example, by confidence score) and at least one correct answer appears among the first K answers, the question counts as a hit (1); otherwise, it counts as a miss (0). The performance of the system is then evaluated by averaging the hits over the entire set of questions. The value of K is set manually based on the specific requirements of the task.
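The hit-or-miss rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the dictionary layout keyed by question id, and the toy data are assumptions made for the example and do not come from any benchmark's official evaluation script.

```python
from typing import Dict, Hashable, Sequence, Set


def hits_at_k(ranked: Sequence[Hashable], gold: Set[Hashable], k: int) -> int:
    """Return 1 if any of the first k ranked answers is in the gold set, else 0."""
    if k <= 0:
        raise ValueError("k must be positive")
    return int(any(answer in gold for answer in ranked[:k]))


def mean_hits_at_k(
    predictions: Dict[str, Sequence[Hashable]],
    golds: Dict[str, Set[Hashable]],
    k: int,
) -> float:
    """Average the per-question hits over every question that has a gold answer set."""
    if not golds:
        raise ValueError("no questions to evaluate")
    # A question without any prediction counts as a miss.
    total = sum(hits_at_k(predictions.get(q, []), gold, k) for q, gold in golds.items())
    return total / len(golds)


# Toy data, invented for illustration only.
predictions = {"q1": ["a", "b", "c"], "q2": ["x", "y", "z"], "q3": []}
golds = {"q1": {"b"}, "q2": {"w"}, "q3": {"m"}}

print(mean_hits_at_k(predictions, golds, k=1))  # no question is hit at K = 1
print(mean_hits_at_k(predictions, golds, k=2))  # only q1 is hit at K = 2
```

With K = 1, the score reduces to checking whether the single highest-ranked answer is correct, which is how accuracy is often computed when a system returns one answer per question.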

Recall indicates the proportion of correct answers predicted by the system out of all correct answers for a single question. Its formula is expressed as follows:

$$\mathrm{Recall} = \frac{| A_{q} \cap \hat{A}_{q} |}{| A_{q} |} \tag{3}$$

F1 denotes the harmonic mean of precision and recall, reflecting the comprehensive performance of the system. Its calculation formula is expressed as follows:

$$F1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{4}$$
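As a minimal sketch, the set-based definitions of precision, recall, and F1 for a single question can be written as follows. The function name and the toy answer sets are hypothetical, and returning 0 when a denominator is empty is an assumed convention that the text does not specify.

```python
from typing import Hashable, Set, Tuple


def precision_recall_f1(
    gold: Set[Hashable], predicted: Set[Hashable]
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for one question from two answer sets."""
    overlap = len(gold & predicted)
    # Assumed convention: an empty denominator yields a score of 0.
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator > 0 else 0.0
    return precision, recall, f1


# Toy answer sets, invented for illustration only.
gold_answers = {"a", "b", "c", "d"}
predicted_answers = {"a", "b", "e"}

p, r, f1 = precision_recall_f1(gold_answers, predicted_answers)
print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
```

In this toy case two of the three predicted answers are correct and two of the four gold answers are found, so precision is 2/3, recall is 1/2, and F1 is 4/7. System-level scores are commonly obtained by averaging these per-question values over all questions.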

For the above metrics, there are corresponding precision@K, recall@K, and F1@K metrics, which are computed in the same way as Hits@K: the value of K is specified manually, and only the first K answers returned by the system are considered in the calculation.
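The @K variants can be sketched by truncating the ranked answer list to its first K entries before applying the set-based formulas. The code below is a hedged illustration: the function name and data are hypothetical, and dividing precision by the number of distinct answers among the first K (rather than by K itself, as some evaluation scripts do) is an assumption that follows the set-based precision formula.

```python
from typing import Hashable, Sequence, Set, Tuple


def precision_recall_f1_at_k(
    ranked: Sequence[Hashable], gold: Set[Hashable], k: int
) -> Tuple[float, float, float]:
    """Apply the set-based precision, recall, and F1 to the first k ranked answers."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = set(ranked[:k])  # only the first k answers are examined
    overlap = len(top_k & gold)
    precision = overlap / len(top_k) if top_k else 0.0
    recall = overlap / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator > 0 else 0.0
    return precision, recall, f1


# Toy ranking, invented for illustration only.
ranked_answers = ["a", "x", "b", "y", "c"]
gold_answers = {"a", "b", "c"}

for k in (2, 5):
    p, r, f1 = precision_recall_f1_at_k(ranked_answers, gold_answers, k)
    print(f"K={k}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Raising K can only keep or increase recall, while precision may fall as lower-ranked and less reliable answers enter the examined scope.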

4.2.2. Robustness

KGQA models ultimately need to be deployed in practical daily applications. However, most current datasets are collected from templates and lack diversity, so they cannot adequately train models to generalize. Moreover, building training datasets requires expensive manual labor, and existing training data cannot cover all types of user queries. To improve the robustness of KGQA models, Gu et al. [60] proposed three levels of generalization as follows:

i.i.d. generalization: i.i.d. stands for independent and identically distributed, an important concept in probability and statistics. In the KGQA task, it is mainly reflected in the construction and evaluation of datasets: the test questions are drawn from the same distribution as the training questions. This is the most basic level of generalization, meaning that a model should at least handle test questions that follow the training distribution.

Compositional generalization: Compositional generalization refers to the ability of a model to handle new queries that combine known entities and relations in ways not seen during training, by reusing previously learned knowledge structures and rules. It can be considered a structured and recursive way of modeling that is very effective in handling various complex KGQA tasks. Compositional generalization methods need to be adjusted and optimized according to specific KG and task characteristics. Generally, techniques such as language representation learning and graph neural networks need to be combined to achieve efficient and accurate KGQA systems.

Zero-shot generalization: Zero-shot generalization refers to the ability of a model to correctly answer questions involving entities and relations that were never seen during training. Although these entities and relations have not appeared in the training set, the model can utilize prior knowledge and infer the correct answer via reasoning, composition, and other methods. Zero-shot generalization gives the model stronger generalization performance, which suits question-answering scenarios involving new entities and relations in practical applications.
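One way to make the three levels concrete is to label a test question by comparing the KG items it uses with what appeared in training. The sketch below is a deliberate simplification and not the procedure of Gu et al. [60]: it assumes each question is reduced to a set of item names, treats a "composition" as exactly that set, and uses invented item names throughout.

```python
from typing import FrozenSet, Set


def generalization_level(
    question_items: FrozenSet[str],
    train_items: Set[str],
    train_compositions: Set[FrozenSet[str]],
) -> str:
    """Assign a test question to one of the three generalization levels."""
    if not question_items <= train_items:
        # At least one item was never seen during training.
        return "zero-shot"
    if question_items not in train_compositions:
        # Every item is known, but this combination is new.
        return "compositional"
    # The same combination of items already occurred in training.
    return "i.i.d."


# Hypothetical item names, invented for illustration only.
train_compositions = {
    frozenset({"film.director", "film.release_year"}),
    frozenset({"person.birthplace"}),
}
train_items = set().union(*train_compositions)

seen = frozenset({"film.director", "film.release_year"})
recombined = frozenset({"film.director", "person.birthplace"})
unseen = frozenset({"music.album"})

print(generalization_level(seen, train_items, train_compositions))
print(generalization_level(recombined, train_items, train_compositions))
print(generalization_level(unseen, train_items, train_compositions))
```

The three calls print `i.i.d.`, `compositional`, and `zero-shot` in that order. Reporting scores separately for each label shows whether a model only memorizes training patterns or also handles new combinations and unseen items.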

The level of generalization represents the expectation for a KGQA model: a model should be able to handle questions related to its training objects at a minimum. Furthermore, it should be extended to new combinations of trained structures. Ideally, a model should also be able to handle new patterns, even entire domains not covered by limited training data, including combinations of new structures. It is believed that models with improved generalization capabilities can be created by explicitly identifying several generalization levels. A practical KGQA model should also have strong generalization abilities to language variations and different reformulations for the same logical form. Some works [65,66,67] utilize generative models to address the issue of coverage and have achieved good performance on GrailQA.

5. Trends and Conclusions

This article provides a comprehensive introduction to complex knowledge graph question answering, including the task definition, main challenges and solutions, current mainstream research methods, and popular datasets. It also indicates the bottleneck challenge in this field, which is the generalization ability of a model. In addition, there are still some challenges in model performance:

(a) Multilingual question answering: Currently, most knowledge graph question-answering systems are designed for English questions. However, efficient QA systems are needed for users in different languages worldwide. Therefore, how to extend knowledge graph question-answering technology to multiple languages is a future problem to be solved.

(b) Incremental learning: When the knowledge graph is updated, using incremental learning methods to update existing node representations can maintain the accuracy and real-time performance of the knowledge graph. Tian et al. [68] proposed a novel knowledge distillation (KD) method for incremental learning. The designed loss function can transfer the feature knowledge from PLMs to the incremental learning model, providing a new perspective for knowledge extraction from PLMs.

(c) Real-time application optimization: In real-time question-answering applications, fast response time is essential, so optimizing system response speed and efficiency for real-time applications is required.

(d) Multi-modal question answering: Future question-answering systems need to support multiple forms of input, such as text, images, and voice [69], to meet the diverse needs of users. For example, Ning et al. [70] proposed a novel method called Differentiable Image–Language Fusion (DILF) for multi-view image and language fusion. This approach utilizes large-scale language models (LLMs) to generate text prompts with rich 3D semantics, enabling efficient fusion between image and text information.

Author Contributions

Conceptualization, Y.S.; methodology, Y.S.; formal analysis, Y.S.; investigation, G.D.; data curation, Y.S.; writing—original draft preparation, Y.S.; writing—review and editing, Y.S. and G.D.; visualization, Y.S.; supervision, X.S. and W.L.; project administration, X.S.; funding acquisition, W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 61972040) and the collaborative innovation project of Chaoyang District, Beijing (CYXC2204).

Data Availability Statement

The datasets and the leaderboard mentioned in this article can be found at https://github.com/KGQA/leaderboard.

Acknowledgments

The authors would like to express their sincere gratitude to the editors and reviewers for their valuable comments and helpful suggestions, which have greatly enhanced the quality of this paper.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

1. Balažević, I.; Allen, C.; Hospedales, T.M. Tucker: Tensor factorization for knowledge graph completion. arXiv 2019, arXiv:1901.09590.
2. Jiang, Z.; Chi, C.; Zhan, Y. Research on medical question answering system based on knowledge graph. IEEE Access 2021, 9, 21094–21101.
3. Guo, Q.; Cao, S.; Yi, Z. A medical question answering system using large language models and knowledge graphs. Int. J. Intell. Syst. 2022, 37, 8548–8564.
4. Hou, X.; Zhu, C.; Li, Y.; Wang, P.; Peng, X. Question answering system based on military knowledge graph. In Proceedings of the International Conference on Electronic Information Engineering and Computer Communication (EIECC 2021), Changchun, China, 23–26 September 2021; SPIE: Bellingham, WA, USA, 2022; Volume 12172, pp. 33–39.
5. Huang, J.; Chen, Y.; Li, Y.; Yang, Z.; Gong, X.; Wang, F.L.; Xu, X.; Liu, W. Medical knowledge-based network for Patient-oriented Visual Question Answering. Inf. Process. Manag. 2023, 60, 103241.
6. Bollacker, K.; Evans, C.; Paritosh, P.; Sturge, T.; Taylor, J. Freebase: A collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, Vancouver, BC, Canada, 10–12 June 2008; pp. 1247–1250.
7. Lehmann, J.; Isele, R.; Jakob, M.; Jentzsch, A.; Kontokostas, D.; Mendes, P.N.; Hellmann, S.; Morsey, M.; Van Kleef, P.; Auer, S.; et al. Dbpedia–a large-scale, multilingual knowledge base extracted from wikipedia. Semant. Web 2015, 6, 167–195.
8. Pellissier Tanon, T.; Vrandečić, D.; Schaffert, S.; Steiner, T.; Pintscher, L. From freebase to wikidata: The great migration. In Proceedings of the 25th International Conference on World Wide Web, Montreal, QC, Canada, 11–15 April 2016; pp. 1419–1428.
9. Suchanek, F.M.; Kasneci, G.; Weikum, G. Yago: A core of semantic knowledge. In Proceedings of the 16th International Conference on World Wide Web, Banff, AB, Canada, 8–12 May 2007; pp. 697–706.
10. Lan, Y.; He, G.; Jiang, J.; Jiang, J.; Zhao, W.X.; Wen, J.R. Complex knowledge base question answering: A survey. IEEE Trans. Knowl. Data Eng. 2022, 35, 11196–11215.
11. Mitra, S.; Ramnani, R.; Sengupta, S. Constraint-based Multi-hop Question Answering with Knowledge Graph. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track, Online, Seattle, WA, USA, 10–15 July 2022; pp. 280–288.
12. Gomes, J., Jr.; de Mello, R.C.; Ströele, V.; de Souza, J.F. A study of approaches to answering complex questions over knowledge bases. Knowl. Inf. Syst. 2022, 64, 2849–2881.
13. Jin, W.; Zhao, B.; Yu, H.; Tao, X.; Yin, R.; Liu, G. Improving embedded knowledge graph multi-hop question answering by introducing relational chain reasoning. Data Min. Knowl. Discov. 2023, 37, 255–288.
14. Bi, X.; Nie, H.; Zhang, G.; Hu, L.; Ma, Y.; Zhao, X.; Yuan, Y.; Wang, G. Boosting question answering over knowledge graph with reward integration and policy evaluation under weak supervision. Inf. Process. Manag. 2023, 60, 103242.
15. Wu, P.; Zhang, X.; Feng, Z. A survey of question answering over knowledge base. In Proceedings of the Knowledge Graph and Semantic Computing: Knowledge Computing and Language Understanding: 4th China Conference, CCKS 2019, Hangzhou, China, 24–27 August 2019, Revised Selected Papers 4; Springer: Berlin/Heidelberg, Germany, 2019; pp. 86–97.
16. Zhang, L.; Zhang, J.; Ke, X.; Li, H.; Huang, X.; Shao, Z.; Cao, S.; Lv, X. A survey on complex factual question answering. AI Open 2023, 4, 1–12.
17. Wang, X.; Yang, S. A tutorial and survey on fault knowledge graph. In Proceedings of the Cyberspace Data and Intelligence, and Cyber-Living, Syndrome, and Health: International 2019 Cyberspace Congress, CyberDI and CyberLife, Beijing, China, 16–18 December 2019; Proceedings, Part II 3. Springer: Berlin/Heidelberg, Germany, 2019; pp. 256–271.
18. Beckett, D.; Berners-Lee, T.; Prud’hommeaux, E.; Carothers, G. RDF 1.1 Turtle. World Wide Web Consort. 2014, 18–31.
19. Talmor, A.; Berant, J. The web as a knowledge-base for answering complex questions. arXiv 2018, arXiv:1803.06643.
20. Francis, N.; Green, A.; Guagliardo, P.; Libkin, L.; Lindaaker, T.; Marsault, V.; Plantikow, S.; Rydberg, M.; Selmer, P.; Taylor, A. Cypher: An evolving query language for property graphs. In Proceedings of the 2018 International Conference on Management of Data, Houston, TX, USA, 10–15 June 2018; pp. 1433–1445.
21. Liang, P. Lambda dependency-based compositional semantics. arXiv 2013, arXiv:1309.4408.
22. Kilgarriff, A. Wordnet: An Electronic Lexical Database; MIT Press: Cambridge, MA, USA, 2000.
23. Speer, R.; Chin, J.; Havasi, C. Conceptnet 5.5: An open multilingual graph of general knowledge. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31.
24. Dong, Z.; Dong, Q. HowNet-a hybrid language and knowledge resource. In Proceedings of the International Conference on Natural Language Processing and Knowledge Engineering, Beijing, China, 26–29 October 2003; IEEE: Piscataway, NJ, USA, 2003; pp. 820–824.
25. Zamini, M.; Reza, H.; Rabiei, M. A review of knowledge graph completion. Information 2022, 13, 396.
26. Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. In Proceedings of the 26th International Conference on Neural Information Processing Systems-Volume 2; Curran Associates Inc.: Red Hook, NY, USA, 2013; pp. 2787–2795.
27. Nickel, M.; Tresp, V.; Kriegel, H.P. A three-way model for collective learning on multi-relational data. In Proceedings of the ICML, Bellevue, WA, USA, 28 June–2 July 2011; Volume 11, pp. 3104482–3104584.
28. Trouillon, T.; Welbl, J.; Riedel, S.; Gaussier, É.; Bouchard, G. Complex embeddings for simple link prediction. In Proceedings of the International Conference on Machine Learning, PMLR, New York, NY, USA, 20–22 June 2016; pp. 2071–2080.
29. Saxena, A.; Kochsiek, A.; Gemulla, R. Sequence-to-sequence knowledge graph completion and question answering. arXiv 2022, arXiv:2203.10321.
30. Sun, Z.; Deng, Z.H.; Nie, J.Y.; Tang, J. Rotate: Knowledge graph embedding by relational rotation in complex space. arXiv 2019, arXiv:1902.10197.
31. Omar, R.; Dhall, I.; Kalnis, P.; Mansour, E. A universal question-answering platform for knowledge graphs. Proc. ACM Manag. Data 2023, 1, 1–25.
32. Chen, X.; Hu, Z.; Sun, Y. Fuzzy logic based logical query answering on knowledge graphs. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 22 February–1 March 2022; Volume 36, pp. 3939–3948.
33. Gao, J.; Yu, H.; Zhang, S. Joint event causality extraction using dual-channel enhanced neural network. Knowl.-Based Syst. 2022, 258, 109935.
34. Yang, B.; Yih, W.t.; He, X.; Gao, J.; Deng, L. Embedding entities and relations for learning and inference in knowledge bases. arXiv 2014, arXiv:1412.6575.
35. Dettmers, T.; Minervini, P.; Stenetorp, P.; Riedel, S. Convolutional 2d knowledge graph embeddings. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
36. Zhang, J.; Zhang, X.; Yu, J.; Tang, J.; Tang, J.; Li, C.; Chen, H. Subgraph retrieval enhanced model for multi-hop knowledge base question answering. arXiv 2022, arXiv:2202.13296.
37. Das, R.; Godbole, A.; Naik, A.; Tower, E.; Zaheer, M.; Hajishirzi, H.; Jia, R.; McCallum, A. Knowledge base question answering by case-based reasoning over subgraphs. In Proceedings of the International Conference on Machine Learning. PMLR, Baltimore, MD, USA, 17–23 July 2022; pp. 4777–4793.
38. Sukhbaatar, S.; Szlam, A.; Weston, J.; Fergus, R. End-to-end memory networks. Adv. Neural Inf. Process. Syst. 2015, 2015, 2440–2448.
39. Hao, Y.; Zhang, Y.; Liu, K.; He, S.; Liu, Z.; Wu, H.; Zhao, J. An end-to-end model for question answering over knowledge base with cross-attention combining global knowledge. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, BC, Canada, 30 July–4 August 2017; Volume 1, pp. 221–231.
40. Chen, Z.Y.; Chang, C.H.; Chen, Y.P.; Nayak, J.; Ku, L.W. UHop: An unrestricted-hop relation extraction framework for knowledge-based question answering. arXiv 2019, arXiv:1904.01246.
41. Shen, T.; Geng, X.; Qin, T.; Guo, D.; Tang, D.; Duan, N.; Long, G.; Jiang, D. Multi-task learning for conversational question answering over a large-scale knowledge base. arXiv 2019, arXiv:1910.05069.
42. Lofgren, P. Efficient Algorithms for Personalized Pagerank; Stanford University: Stanford, CA, USA, 2015.
43. Qiu, Y.; Zhang, K.; Wang, Y.; Jin, X.; Bai, L.; Guan, S.; Cheng, X. Hierarchical query graph generation for complex question answering over knowledge graph. In Proceedings of the 29th ACM International Conference on Information Knowledge Management, Virtual Event, 19–23 October 2020; pp. 1285–1294.
44. Chen, Y.; Wu, L.; Zaki, M.J. Toward Subgraph-Guided Knowledge Graph Question Generation With Graph Neural Networks. IEEE Trans. Neural Netw. Learn. Syst. 2023, 1–12.
45. Bommasani, R.; Hudson, D.A.; Adeli, E.; Altman, R.; Arora, S.; von Arx, S.; Bernstein, M.S.; Bohg, J.; Bosselut, A.; Brunskill, E.; et al. On the opportunities and risks of foundation models. arXiv 2021, arXiv:2108.07258.
46. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
47. Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. Roberta: A robustly optimized bert pretraining approach. arXiv 2019, arXiv:1907.11692.
48. Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901.
49. Do, P.; Phan, T.H. Developing a BERT based triple classification model using knowledge graph embedding for question answering system. Appl. Intell. 2022, 52, 636–651.
50. Sun, Y.; Shi, Q.; Qi, L.; Zhang, Y. JointLK: Joint reasoning with language models and knowledge graphs for commonsense question answering. arXiv 2021, arXiv:2112.02732.
51. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
52. Wang, Y.; Zhang, H.; Liang, J.; Li, R. Dynamic Heterogeneous-Graph Reasoning with Language Models and Knowledge Representation Learning for Commonsense Question Answering. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, ON, Canada, 9–14 July 2023; Volume 1, pp. 14048–14063.
53. Zhang, Q.; Chen, S.; Fang, M.; Chen, X. Joint reasoning with knowledge subgraphs for Multiple Choice Question Answering. Inf. Process. Manag. 2023, 60, 103297.
54. Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 2020, 21, 5485–5551.
55. Lepikhin, D.; Lee, H.; Xu, Y.; Chen, D.; Firat, O.; Huang, Y.; Krikun, M.; Shazeer, N.; Chen, Z. Gshard: Scaling giant models with conditional computation and automatic sharding. arXiv 2020, arXiv:2006.16668.
56. Jiao, S.; Zhu, Z.; Wu, W.; Zuo, Z.; Qi, J.; Wang, W.; Zhang, G.; Liu, P. An improving reasoning network for complex question answering over temporal knowledge graphs. Appl. Intell. 2023, 53, 8195–8208.
57. Yasunaga, M.; Ren, H.; Bosselut, A.; Liang, P.; Leskovec, J. QA-GNN: Reasoning with language models and knowledge graphs for question answering. arXiv 2021, arXiv:2104.06378.
58. Yasunaga, M.; Bosselut, A.; Ren, H.; Zhang, X.; Manning, C.D.; Liang, P.S.; Leskovec, J. Deep bidirectional language-knowledge graph pretraining. Adv. Neural Inf. Process. Syst. 2022, 35, 37309–37323.
59. Tan, Y.; Chen, Y.; Qi, G.; Li, W.; Wang, M. MLPQ: A Dataset for Path Question Answering over Multilingual Knowledge Graphs. Big Data Res. 2023, 32, 100381.
60. Gu, Y.; Kase, S.; Vanni, M.; Sadler, B.; Liang, P.; Yan, X.; Su, Y. Beyond IID: Three levels of generalization for question answering on knowledge bases. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; pp. 3477–3488.
61. Cao, S.; Shi, J.; Pan, L.; Nie, L.; Xiang, Y.; Hou, L.; Li, J.; He, B.; Zhang, H. KQA pro: A dataset with explicit compositional programs for complex question answering over knowledge base. arXiv 2020, arXiv:2007.03875.
62. Trivedi, P.; Maheshwari, G.; Dubey, M.; Lehmann, J. Lc-quad: A corpus for complex question answering over knowledge graphs. In Proceedings of the Semantic Web–ISWC 2017: 16th International Semantic Web Conference, Vienna, Austria, 21–25 October 2017; Proceedings, Part II 16. Springer: Berlin/Heidelberg, Germany, 2017; pp. 210–218.
63. Dubey, M.; Banerjee, D.; Abdelkawi, A.; Lehmann, J. Lc-quad 2.0: A large dataset for complex question answering over wikidata and dbpedia. In Proceedings of the Semantic Web–ISWC 2019: 18th International Semantic Web Conference, Auckland, New Zealand, 26–30 October 2019; Proceedings, Part II 18. Springer: Berlin/Heidelberg, Germany, 2019; pp. 69–78.
64. Bao, J.; Duan, N.; Yan, Z.; Zhou, M.; Zhao, T. Constraint-based question answering with knowledge graph. In Proceedings of the COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan, 11–16 December 2016; pp. 2503–2514.
65. Ye, X.; Yavuz, S.; Hashimoto, K.; Zhou, Y.; Xiong, C. RNG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022; Volume 1, pp. 6032–6043.
66. Sun, Y.; Zhang, Y.; Qi, L.; Shi, Q. TSGP: Two-Stage Generative Prompting for Unsupervised Commonsense Question Answering. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 968–980.
67. Madani, N.; Joseph, K. Answering Questions Over Knowledge Graphs Using Logic Programming Along with Language Models. arXiv 2023, arXiv:2303.02206.
68. Tian, S.; Li, W.; Ning, X.; Ran, H.; Qin, H.; Tiwari, P. Continuous transfer of neural network representational similarity for incremental learning. Neurocomputing 2023, 545, 126300.
69. Ran, H.; Ning, X.; Li, W.; Hao, M.; Tiwari, P. 3D human pose and shape estimation via de-occlusion multi-task learning. Neurocomputing 2023, 548, 126284.
70. Ning, X.; Yu, Z.; Li, L.; Li, W.; Tiwari, P. DILF: Differentiable rendering-based multi-view Image-Language Fusion for zero-shot 3D shape understanding. Inf. Fusion 2023, 102, 102033.

Figure 1. An example of a complex question with constraint and multi-hop complexity.

Figure 2. “Jackie Chan” and their information in Wikipedia as an example of Resource Description Framework.

Figure 3. The pipeline of graph-metric-based method.

Figure 4. The pipeline of GNN-based methods.

Figure 5. The pipeline of the joint reasoning of PLM and KG.

Table 1. Notation and problem definition for each triplet used in the following sections.

Notation	Description
$e_{s}, e_{r}, e_{o}$	Embedding vector of subject entity, relation, and object entity
$E, R$	Entity set and relation set
$(s, r, o)$	(Subject entity, relation, and object entity)
N	Number of entities
d	Dimensionality of embeddings
$M_{r}$	The matrix of relation
$\hat{e_{s}}, \hat{e_{r}}$	Two-dimensional vectors of subject entity and relation
w	Kernel

Table 2. Different KGE models used in C-KGQA.

Categories	KGE Model	Score Function	Memory Complexity
Distance-based	TransE [26]	$- {\| s + r - o \|}_{l_{1} / l_{2}}$	$O (N_{e} d + N_{r} d)$
Distance-based	RotatE [30]	$- \| s \odot r - o \|$	$O (2 N_{e} d + 2 N_{r} d)$
Tensor-decomposition-based	ComplEx [28]	$\mathrm{Re} (\langle s, r, \bar{o} \rangle)$	$O (2 N_{e} d + 2 N_{r} d)$
Tensor-decomposition-based	RESCAL [27]	$s^{\top} M_{r} o$	$O (N_{e} d + N_{r} d^{2})$
Tensor-decomposition-based	DistMult [34]	$s^{\top} \mathrm{diag} (M_{r}) o$	$O (N_{e} d + N_{r} d)$
Convolution-based	ConvE [35]	$f (\mathrm{vec} (f ([\hat{e_{s}}; \hat{e_{r}}] \ast w)) M) e_{o}$	$O (N_{e} d + N_{r} d + T m_{w} n_{w} + T d (2 d_{m} - m_{w} + 1) (d_{n} - n_{w} + 1))$
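To illustrate how the score functions in Table 2 operate, the following pure-Python sketch implements four of them on toy embeddings. It assumes the standard formulations in which a higher score means a more plausible triple: TransE as the negative distance between $s + r$ and $o$, DistMult with the relation stored as the diagonal of $M_{r}$, ComplEx with a complex conjugate on the object, and RotatE with relations given as rotation phases. The function names and example vectors are invented for illustration, and summing complex moduli in RotatE is one common choice rather than the only one.

```python
import cmath
import math
from typing import Sequence


def transe_score(s: Sequence[float], r: Sequence[float], o: Sequence[float], norm: int = 1) -> float:
    """Negative L1 or L2 distance between the translated subject (s + r) and the object."""
    diffs = [si + ri - oi for si, ri, oi in zip(s, r, o)]
    if norm == 1:
        return -sum(abs(d) for d in diffs)
    return -math.sqrt(sum(d * d for d in diffs))


def distmult_score(s: Sequence[float], r: Sequence[float], o: Sequence[float]) -> float:
    """Trilinear product s^T diag(r) o, with r holding the diagonal of the relation matrix."""
    return sum(si * ri * oi for si, ri, oi in zip(s, r, o))


def complex_score(s: Sequence[complex], r: Sequence[complex], o: Sequence[complex]) -> float:
    """Real part of the trilinear product with the object embedding conjugated."""
    return sum(si * ri * oi.conjugate() for si, ri, oi in zip(s, r, o)).real


def rotate_score(s: Sequence[complex], r_phase: Sequence[float], o: Sequence[complex]) -> float:
    """Negative distance after rotating each subject coordinate by the relation phase."""
    rotations = [cmath.exp(1j * theta) for theta in r_phase]  # unit-modulus relation entries
    return -sum(abs(si * ri - oi) for si, ri, oi in zip(s, rotations, o))


# Toy embeddings, invented for illustration only.
print(transe_score([1.0, 2.0], [0.5, -1.0], [1.5, 1.0]))    # s + r equals o exactly
print(distmult_score([1.0, 2.0], [3.0, 4.0], [5.0, 6.0]))   # 1*3*5 + 2*4*6
print(complex_score([1 + 1j], [1j], [1 - 1j]))
print(rotate_score([1 + 0j], [math.pi / 2], [1j]))          # rotating 1 by 90 degrees gives i
```

In a KGQA pipeline, such a function is typically used to rank candidate answer entities $o$ for a given topic entity $s$ and relation $r$, with the highest-scoring candidates returned as answers.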

Table 3. Different strategies of subgraph retrievers in recent works.

Model	Year	Strategy
HQGC [43]	2020	Director–actor–critic framework based on hierarchical reinforcement learning with intrinsic motivation
CBR-SUBG [37]	2022	Dynamically retrieves similar queries and subgraphs; adaptive subgraph collection strategy
SR [36]	2022	Trainable subgraph retriever implemented via a dual encoder
Graph2seq [44]	2023	Utilized a bidirectional Graph2Seq model to encode the KG subgraph

Table 4. Different strategies for integrating two modalities.

Model	Year	Strategy
QA-GNN [57]	2021	Connects the question and the KG to form a joint graph.
JointLK [50]	2022	A dense bidirectional attention module.
GreaseLM [58]	2022	Designed a special interaction token and passed through N LM-based unimodal encoding layers.
DHLK [52]	2023	Proposed a dynamic heterogeneous graph with LMs and KG; relation mask self-attention.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
