Article

Fake News Detection Based on Knowledge-Guided Semantic Analysis

1 Southwest China Institute of Electronic Technology, Chengdu 610036, China
2 School of Cyber Science and Engineering, Sichuan University, Chengdu 610207, China
* Author to whom correspondence should be addressed.
Electronics 2024, 13(2), 259; https://doi.org/10.3390/electronics13020259
Submission received: 18 November 2023 / Revised: 3 January 2024 / Accepted: 3 January 2024 / Published: 5 January 2024
(This article belongs to the Section Computer Science & Engineering)

Abstract
Recently, fake news, i.e., low-quality news containing intentionally false information, has threatened the authenticity of news information. However, existing detection methods are inefficient in modeling complicated data and in leveraging external knowledge. To address these limitations, we propose a fake news detection framework based on knowledge-guided semantic analysis, which compares the news to external knowledge through triplets for fake news detection. Considering that equivalent elements of triplets may be presented in different forms, a triplet alignment method is designed to construct the bridge between news documents and knowledge graphs. Then, a dual-branch network is developed to conduct interaction and comparison between text and knowledge embeddings. Specifically, text semantics are analyzed with the guidance generated by a triplet aggregation module to capture the inconsistency between news content and external knowledge. In addition, a triplet scoring module is designed to measure rationality in view of general knowledge as a complementary clue. Finally, an interaction module is proposed to fuse rationality scores in aspects of text semantics and external knowledge to obtain detection results. Extensive experiments are conducted on publicly available datasets and several state-of-the-art methods are considered for comparison. The results verify the superiority of the proposed method in achieving more reliable fake news detection results.

1. Introduction

Fake news refers to news reports whose content is inconsistent with facts, intentionally misleading, or fabricated. This type of news typically exploits trending or sensitive social topics to attract public attention, generating adverse impacts and social instability. Moreover, with the development of content recommendation techniques, social media have become important sources of news content (https://wearesocial.com/cn/blog/2022/07/the-global-state-of-digital-in-july-2022-part-one/ (accessed on 26 July 2022)). Consequently, the generation and dissemination of fake news have become faster and easier, posing a considerable challenge to obtaining accurate and reliable information. For example, after the outbreak of the coronavirus disease 2019 (COVID-19) pandemic, fake news related to medical knowledge affected people’s adherence to mandated prevention measures [1]. According to reports published in 2023, over 70 percent of Europeans regularly encounter fake news. Moreover, text is the most widely used medium for constructing fake news due to its high information density. Therefore, this work focuses on detecting news documents (such as news articles and posts) that contain false information.
Fake news detection techniques are of great significance as a countermeasure. Manually exposing fake news on social networks is inefficient: it requires considerable manpower and yields poor detection performance. Regarding the automatic detection of fake news, researchers have pointed out that semantic inconsistencies are vital clues for verifying the authenticity of news information on social networks. In the past, researchers applied natural language processing technologies to expose fake news. More recently, a range of fake news detection methods have been proposed by incorporating deep learning techniques, such as neural networks. Specifically, text-content-based methods [2,3] typically extract features from headlines, sentences, and writing styles of news documents. However, most existing text-content-based detection methods insufficiently consider inter-sentence relations, leading to suboptimal detection results. On the other hand, social-network-based detection methods, such as [4,5], consider the user’s social network data and utilize data mining technology for fake news detection. However, in real-world scenarios, social network platforms impose strict confidentiality agreements regarding users’ private data. In addition, some fake news datasets cannot be publicly accessed due to data privacy, which makes training detection models infeasible. Therefore, such social-network-based detection methods may be difficult to deploy in practical applications. Moreover, researchers have introduced graph neural networks to develop detection methods. Graph-based detection methods [6] design document graphs for the input data, and the quality of the constructed document graph directly influences detection performance. In addition, the graph construction process is complex and cumbersome, which reduces the efficiency and flexibility of such detection methods.
To overcome these limitations, we propose a fake news detection framework based on knowledge-guided semantic analysis, which is inspired by the following intuition: when people read the news and want to check its truthfulness, they first check the syntax and semantics of the text and then compare the textual information with external resources for fact-checking. In our method, triplets are used to construct the bridge between news documents and knowledge graphs of external knowledge. A dual-branch detection model with heterogeneous architectures is designed to jointly conduct interaction and comparison of text and knowledge embeddings. Specifically, the text embedding is extracted by a BERT (Bidirectional Encoder Representation from Transformers) model to encode contextual and semantic information. On the other hand, knowledge embeddings provide structured representations of factual information. More specifically, a triplet aggregation module is constructed to generate the guidance for text semantic analysis, capturing the inconsistency between text content and external knowledge. In addition, a triplet scoring module is designed to measure rationality in view of general knowledge as a complementary clue. Finally, an interaction module is developed to fuse rationality scores in different aspects to obtain the final detection results. The main contributions of this work are summarized as follows:
  • We propose a dual-branch neural network with heterogeneous architectures for fake news detection based on knowledge-guided semantic analysis, which compares the news text to external knowledge for efficiently exposing fake news.
  • To construct the bridge between the text and external knowledge, triplets are taken into consideration and a fuzzy-matching-based triplet alignment technique is developed to handle the case where equivalent elements are presented in different forms.
  • To capture the inconsistency between the news content and external knowledge, a triplet aggregation module is developed for obtaining document-level knowledge representation as the guidance of text semantic analysis. In addition, we also consider the rationality of general knowledge as a complementary clue, which is measured by a triplet scoring module in the knowledge embedding space.
  • Finally, to leverage the complementarity between text semantics and external knowledge, a text and knowledge interaction module with learnable weights is constructed to obtain the final detection results based on rationality scores from different branches.
The rest of this paper is organized as follows: Section 2 briefly introduces related works on fake news detection. Section 3 presents the proposed fake news detection framework in detail. In Section 4, experiments are conducted to evaluate the performance of the proposed method and comparison experiments are conducted with other detection methods. Section 5 draws the conclusions.

2. Related Work

According to the type of detection features, fake news detection techniques can be divided into several categories, including text-content-based, social-network-based, and external-knowledge-based methods [7].

2.1. Text-Content-Based Detection Methods

Text-content-based detection methods mainly use news text to construct detection features to verify the authenticity of news. These include the language style, logical relationship, and other information from news headlines, citations, and text sources. Moreover, the degree of similarity, differences, and contradictions between a suspicious news document and related posts from other sources [7] can also be utilized. These methods can be further divided into text-feature-based and graph-modeling-based detection methods.

2.1.1. Text-Feature-Based Detection Methods

In this type of detection method, vocabulary and syntactic features presented in a particular writing style are extracted for fake news detection. For example, Potthast et al. [8] detected fake news by considering the unique properties of text content based on a meta learning approach. Deep neural networks have also been applied to learn detection features, avoiding the high cost of hand-crafted feature engineering. Kong et al. [9] combined bidirectional long short-term memory (Bi-LSTM) and a convolutional neural network (CNN) to effectively detect fake news based on their capability to represent text content. Zhao et al. [2] applied a mixture-of-experts model to fuse the detection features from different domains, so as to achieve better performance.

2.1.2. Graph-Modeling-Based Detection Methods

By constructing documents in the form of graph structures, the relationships between sentences can be explored efficiently. For fake news detection, Vaibhav et al. [10] assigned document sentences as graph nodes to model documents as graph structures and then applied graph attention networks to learn document features. Furthermore, news posted on social media may contain multimodal information, such as textual, user, and temporal data, which can also be used to conduct detection. For example, Nguyen et al. [11] proposed a graphical representation learning method to capture the social context in a high-fidelity representation for fake news. Gangireddy et al. [3] proposed an unsupervised fake news detection method by using graph-based mining techniques. It includes three stages to conduct full dataset labeling.
However, text-content-based detection methods only consider the textual and semantic information of news content but ignore the rich factual information from external knowledge bases.

2.2. Social-Network-Based Detection Methods

Detection methods based on social network data judge the authenticity of news by leveraging user behavior data and propagation characteristics from social media. For example, the responses to a source post exhibit different activity patterns for trusted news and fake news. If a post contains false information, some direct responses express an opposing stance; otherwise, the responses are more likely to be supportive. In previous works [12], supporting and opposing relations between responses have been used to help analysts judge news credibility. Social-network-based detection methods can be further divided into two categories: user-stance-analysis-based and propagation-analysis-based detection methods.

2.2.1. User-Stance-Analysis-Based Detection Methods

The attitudes and perspectives of social media users towards news content are used to verify the authenticity of news. This requires the mining and analysis of users’ opinions on news content. For example, some researchers have analyzed users’ responses to determine their stance on the news and then calculated the distribution of stance features to detect fake news. Specifically, if the stances of users’ comments regarding a news document are relatively balanced, this news is more likely to be trustworthy. In contrast, if the stances of users’ comments on a news document contain extreme views, this news should be classified as fake news [13]. Wu et al. [14] proposed a detection method based on multi-task learning that extracts common and specific features of news content, and learns stance information simultaneously. More recently, Yuan et al. [4] claimed that it is possible to conduct explainable fake news detection by considering stance information.

2.2.2. Propagation-Analysis-Based Detection Methods

This type of detection method relies on the basic assumption that the credibility of a news document is strongly related to the credibility of its associated social media posts. Jin et al. [12] constructed heterogeneous credibility networks to model the dissemination process of social media information. Zhang et al. [15] constructed a heterogeneous network of news articles, authorship, and news themes, where a deep diffusive network was designed to integrate this information for fake news detection. Moreover, fake news usually spreads rapidly through social media, text messages, or emails. Therefore, based on propagation analysis, fake news can be exposed by considering the speed and scope of news dissemination using deep learning techniques. Other works, such as [16], model news dissemination chains by leveraging graph convolutional networks. Specifically, Dou et al. [17] proposed a user-preference-aware fake news detection framework to integrate the dissemination chains of news and related posts. Silva et al. [5] proposed a feature embedding method for news propagation chains that enables the early detection of fake news by reconstructing the knowledge of complete propagation networks from partial propagation networks.
Due to data privacy and heterogeneous data structures, social-network-based detection methods may be difficult to deploy across different social network platforms in practical applications.

2.3. External-Knowledge-Based Detection Methods

Differing from the above-mentioned detection methods, researchers have leveraged external knowledge to determine whether news content should be trusted. It is challenging to comprehensively explore the meaning of news content using only text semantic information, since news text is highly condensed and suffers from problems such as polysemy and unexplained abbreviations. Therefore, external knowledge has been used in combination with text information to improve fake news detection capability. Zhang et al. [18] proposed a multimodal knowledge-aware event memory network for rumor detection. Specifically, a knowledge-aware network was constructed to borrow external knowledge from real-world knowledge graphs as additional evidence. In addition, they designed an event memory network to obtain event-invariant features as a reference, achieving more robust representations. In [19], a knowledge-graph-enhanced framework was designed to detect fake news and provide relational explanations; the extracted graph embeddings were combined with a graph convolutional network to obtain the detection results. Li et al. [20] utilized both objective facts and subjective views to detect fake news by constructing heterogeneous graph objects. Although external knowledge helps to obtain more reliable detection results by considering the various relationships among objects in the knowledge graph, the fusion strategy for text information and external knowledge remains an open issue.

2.4. Comparison between Different Types of Detection Methods

To better conduct a comparison between different types of detection methods, a summary of the benefits and drawbacks of existing detection methods is presented in Table 1.

3. Fake News Detection Framework Based on Knowledge-Guided Semantic Analysis

In this section, we detail the proposed detection framework based on knowledge-guided semantic analysis, which compares the news to external knowledge for exposing fake news. As shown in Figure 1, to fully leverage external knowledge, we take the triplets as the bridge between news documents and knowledge graphs. Specifically, a text semantics embedding subnetwork is constructed based on a pre-trained BERT model to encode the semantics of the news document. In addition, for knowledge embedding, a triplet aggregation module is designed to extract the document-level knowledge representation as the guidance for text semantics analysis, where text and knowledge embeddings are compared to explore the inconsistency caused by false information. Furthermore, a triplet scoring module is constructed to measure rationality in view of general knowledge, which can be treated as a complementary detection clue. Finally, rationality scores in different aspects are combined for fake news detection.

3.1. The Preprocessing Operations of Extracting Triplets

In this work, we take the triplets as the bridge between news documents and knowledge graphs. The preliminary operations of extracting triplets from the news document are presented in this section.
Due to information redundancy and the variable length of the input document, it is difficult to directly link text content to structured features. Therefore, in the first step, a triplet extraction algorithm is applied to obtain triplets of the input document, where a triplet consists of a head entity, a relation, and a tail entity. For example, for the input sentence “Gracia is bordered by the districts of Eixample to the south.”, the head entity “Gracia”, the relation “borders with”, and the tail entity “Eixample” can be obtained by the triplet extraction technique. Since triplet extraction is not the focus of this work, the mainstream algorithm REBEL (Relation Extraction By End-to-end Language generation) [21] is selected. Specifically, REBEL tackles relation extraction and classification as a generation task, where an autoregressive model outputs triplets for the input text, and BART (Bidirectional and Auto-Regressive Transformers) [22] is employed as its base model.
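As an illustration (not part of the original paper), REBEL emits triplets as a linearized string in which special tokens delimit the entities and the relation. The following minimal Python decoder sketches how such output can be turned into (head, relation, tail) tuples; the token order `<triplet> head <subj> tail <obj> relation` is an assumption based on REBEL's released decoding code:

```python
def decode_rebel(linearized: str):
    """Decode a REBEL-style linearized string into (head, relation, tail) triplets.

    Assumed token order: <triplet> HEAD <subj> TAIL <obj> RELATION.
    """
    triplets = []
    head = tail = relation = ""
    mode = None
    cleaned = linearized.replace("<s>", "").replace("</s>", "").replace("<pad>", "")
    for tok in cleaned.split():
        if tok == "<triplet>":
            if relation:  # flush the previous triplet before starting a new one
                triplets.append((head.strip(), relation.strip(), tail.strip()))
            head, tail, relation, mode = "", "", "", "head"
        elif tok == "<subj>":
            mode = "tail"
        elif tok == "<obj>":
            mode = "relation"
        elif mode == "head":
            head += " " + tok
        elif mode == "tail":
            tail += " " + tok
        elif mode == "relation":
            relation += " " + tok
    if relation:  # flush the last triplet
        triplets.append((head.strip(), relation.strip(), tail.strip()))
    return triplets
```

For the example sentence above, decoding the string `<triplet> Gracia <subj> Eixample <obj> borders with` yields `[("Gracia", "borders with", "Eixample")]`.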

3.2. Triplet Alignment Module

The details of constructing the knowledge graph and conducting triplet alignment are presented in this section.

3.2.1. Construction of the Knowledge Graph

The basic step in leveraging external knowledge is to represent it as structured features that can be compared with text features. Across external-knowledge-based methods for different tasks, knowledge graphs are widely used to provide a structured organization of external knowledge for the subsequent knowledge embedding process. We utilize the CommonSense Knowledge Graph (CSKG) [23] to align the triplets extracted from the input document, since CSKG is a comprehensive aggregation of various common sense knowledge sources and can serve as a varied set of reference points when assessing the veracity of news content against the text of the input document. For example, given the relation “fields of this occupation” extracted by REBEL, its corresponding element in the CSKG is presented in a different form, namely, “/r/dbpedia/occupation”, where “r” denotes a relation element, “dbpedia” denotes the knowledge base “DBpedia”, and “occupation” denotes the matched relation. Since we do not focus on developing a new strategy for constructing knowledge graphs, the officially released CSKG [23] is applied in this work; the details of its construction are reported in [23].

3.2.2. A Fuzzy-Matching-Based Triplet Alignment Method

For triplets from different sources, it is necessary to conduct triplet alignment for equivalent elements presented in various forms. Existing works, such as [24], manually labeled entity information or mapped it to the knowledge graph using entity linking tools. However, due to the diversity of triplet extraction strategies and data sources, equivalent entities or relations may have different forms, which lead to inefficiency in leveraging external knowledge to guide fake news detection.
Unlike existing detection methods, which apply data-driven alignment, a fuzzy-matching-based triplet alignment technique is proposed in this work to efficiently match triplet elements between the news document and the knowledge graph. For a document $D$, assume it has undergone format cleaning, so that the text does not contain emoticons, web links, special symbols, or non-English characters; regular expressions can be used to filter such content. After extraction, the triplets of the document are denoted as $H_D = \{(h_1, r_1, t_1), (h_2, r_2, t_2), \dots, (h_{N_D}, r_{N_D}, t_{N_D})\}$, where $N_D$ denotes the number of extracted triplets and $(h_i, r_i, t_i)$ denotes the $i$th triplet ($i \in \{1, 2, \dots, N_D\}$), consisting of a head entity $h_i$, a relation $r_i$, and a tail entity $t_i$. Our fuzzy-matching-based triplet alignment method is built on the Levenshtein distance, a metric that measures the difference between two strings and has been widely used in tasks such as spelling checking and speech recognition; it offers high computational efficiency and promising flexibility across data sources. It is defined as the minimum number of editing operations (inserting, deleting, or replacing a character) required to convert one string into another. Therefore, our fuzzy matching strategy for triplets can be formulated as follows:
$$x_{index} = \mathop{\arg\min}_{x_i \in H_{KG}} d(x_D, x_i) \qquad (1)$$
$$d(s_a, s_b) = \min \left\{ N \mid s_b = O_1 \circ O_2 \circ \cdots \circ O_N (s_a) \right\} \qquad (2)$$
$$O_n \in \{\mathrm{INSERT}(\cdot), \mathrm{DELETE}(\cdot), \mathrm{REPLACE}(\cdot)\} \qquad (3)$$
where $x_D \in H_D$ denotes an element of a triplet from the input document; $x_i$ denotes an element of the knowledge graph $H_{KG}$; $x_{index}$ denotes the matched element; $d(\cdot, \cdot)$ denotes the Levenshtein distance between two strings $s_a$ and $s_b$; $O_n(\cdot)$ denotes the $n$th operation modifying the string; and $\mathrm{INSERT}(\cdot)$, $\mathrm{DELETE}(\cdot)$, and $\mathrm{REPLACE}(\cdot)$ denote inserting, deleting, and replacing a character, respectively. In practical implementation, the Levenshtein distance can be calculated with a dynamic programming algorithm [25]. For an input triplet element, only the matched element with the minimum Levenshtein distance is taken as the output; if several matched elements have an identical distance, the element whose length is closest to that of the input is chosen as the final result. Please note that boosting the matching speed by shrinking the search space in the knowledge graph remains an open issue; strategies such as pre-constructed dictionaries of knowledge bases that yield sets of candidate elements (e.g., entities) can be applied in future work. To illustrate the triplet matching process in the knowledge graph more clearly, an example is presented in Figure 2.
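The alignment rule above can be sketched in a few lines of Python. This is an illustrative implementation (not the authors' code) of the dynamic-programming Levenshtein distance and the tie-breaking rule that prefers the candidate closest in length:

```python
def levenshtein(sa: str, sb: str) -> int:
    # Classic dynamic-programming edit distance: minimum number of
    # single-character insertions, deletions, and replacements.
    prev = list(range(len(sb) + 1))
    for i, ca in enumerate(sa, 1):
        cur = [i]
        for j, cb in enumerate(sb, 1):
            cur.append(min(prev[j] + 1,                  # delete ca
                           cur[j - 1] + 1,               # insert cb
                           prev[j - 1] + (ca != cb)))    # replace or match
        prev = cur
    return prev[-1]

def align_element(x_d: str, kg_elements):
    # Pick the knowledge graph element with the minimum Levenshtein distance;
    # ties are broken by choosing the candidate whose length is closest to x_d.
    return min(kg_elements,
               key=lambda x: (levenshtein(x_d, x), abs(len(x) - len(x_d))))
```

In practice, `kg_elements` would be a (pre-filtered) set of candidate entities or relations from the knowledge graph rather than the full element set.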
Based on this triplet alignment process, each triplet element (e.g., an entity) from a news document can be matched to the most related element in the knowledge graph. As shown in Figure 3, we provide several examples of alignment results based on our strategy. Then, the aligned triplet set $H_{Index}$ of the input document is used by the subsequent knowledge graph semantic analysis subnetwork.

3.3. A Dual-Branch Network for Fake News Detection Based on Knowledge-Guided Semantic Analysis

In this section, the proposed dual-branch network for fake news detection based on knowledge-guided semantic analysis is presented in detail, where two subnetworks are constructed to interact and fuse text semantics and external knowledge to expose the irrationality of fake news.

3.3.1. Text Semantic Embedding Subnetwork

The efficient extraction of text embeddings is necessary to represent contextual information for the input document. In the fake news detection task, the contextual and bidirectional information between texts is useful for encoding text semantics. Therefore, we apply the BERT model [26] as the text embedding subnetwork to perform feature analysis at the text level. For an input document, the BERT tokenizer outputs a sequence of tokens of fixed length, such as 512 (the default setting of the original BERT model [27]). This token sequence is fed into the BERT model to obtain the text embedding. Specifically, the BERT model is constructed by stacking multiple layers of bidirectional transformer encoders, a structure that yields the capability to handle complicated text tasks and to generate deep representations integrating contextual and bidirectional information between texts. The pre-trained BERT model introduced in [27] is used to achieve promising word embedding performance for fake news detection, as shown in Equation (4):
$$V_D = \mathrm{BERT}(\mathrm{Tokenizer}(D)) \qquad (4)$$
where $D$ denotes the input document; $\mathrm{Tokenizer}(\cdot)$ denotes the token extraction operation; $\mathrm{BERT}(\cdot)$ denotes the pre-trained BERT model; and $V_D$ is the output text representation. Please note that, in the proposed detection method, the BERT model is used to extract text features, while BART is used by the REBEL algorithm to extract triplets from the input document. BERT is efficient for discriminative tasks, such as text classification, while BART is better suited to generative tasks, such as text generation and text reconstruction.
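As a minimal sketch of the fixed-length sequence preparation the tokenizer performs, the following helper pads or truncates a token-ID list to a fixed length; the special-token IDs 101 ([CLS]), 102 ([SEP]), and 0 ([PAD]) are those conventionally used by `bert-base-uncased` and are an assumption here. Real pipelines would use the tokenizer shipped with the pre-trained model:

```python
def to_fixed_length(token_ids, max_len: int = 512,
                    pad_id: int = 0, cls_id: int = 101, sep_id: int = 102):
    # Prepend [CLS], append [SEP], then truncate or right-pad with [PAD]
    # so that every document maps to exactly `max_len` token IDs.
    ids = [cls_id] + list(token_ids)[: max_len - 2] + [sep_id]
    return ids + [pad_id] * (max_len - len(ids))
```

The resulting fixed-length sequence is what a BERT encoder consumes to produce the text representation $V_D$.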

3.3.2. Knowledge Graph Semantic Analysis Subnetwork

To conduct semantic analysis with knowledge-level clues, two modules are designed for the knowledge graph semantic analysis subnetwork, including a triplet embedding aggregation module and a triplet scoring module, which aim to efficiently conduct interaction and fusion between text semantics and external knowledge.
1. The training process of general knowledge embedding: For a given knowledge graph $G = \{(h_i, r_i, t_i) \mid h_i, t_i \in E,\ r_i \in R\}$, where $E$ and $R$ denote the sets of entities and relations in the knowledge graph, the mapping function from elements of the knowledge graph to feature vectors can be obtained by training a knowledge embedding model, such as TransE [28] used in this work, where the dimension of the knowledge representation is set as $k$.
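The intuition behind TransE can be sketched in NumPy: a valid triplet should satisfy $E_h + E_r \approx E_t$, and training minimizes a margin-based hinge loss between true and corrupted triplets. This is an illustrative sketch, not the training code used in the paper:

```python
import numpy as np

def transe_score(e_h, e_r, e_t):
    # TransE models a valid triplet as E_h + E_r ≈ E_t, so the L2 norm of the
    # residual serves as an implausibility score (smaller = more plausible).
    return float(np.linalg.norm(e_h + e_r - e_t))

def margin_loss(pos, neg, gamma=1.0):
    # Hinge loss pushing a true triplet to score at least `gamma` lower than
    # a corrupted (negative-sample) triplet, as in the training objective of
    # the triplet scoring module below.
    return max(0.0, gamma + transe_score(*pos) - transe_score(*neg))
```

During training, gradients of this loss with respect to the entity and relation vectors drive plausible triplets toward the translational constraint.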
2. Triplet aggregation module based on a token and channel mixing mechanism: Instead of constructing a graph-based learning procedure for document-level external knowledge representations as in existing methods, we design a triplet aggregation module based on a token and channel mixing mechanism. It fully conducts the interaction of knowledge representations within and across triplets, yielding a better document-level knowledge representation to guide the analysis of text semantics.
Specifically, after the preprocessing operations, the trained knowledge embedding model and the triplets of the input document matched to the knowledge graph are available. For each triplet $(h_i, r_i, t_i)$, the corresponding knowledge representations $E_i^h$, $E_i^r$, and $E_i^t$ of $h_i$, $r_i$, and $t_i$ are obtained by applying the trained knowledge embedding model, and the final triplet representation $E_i^{hrt} \in \mathbb{R}^{3k}$ is calculated by concatenating them. Consequently, for the input document, the set of knowledge representations is obtained, namely, $X = \{E_1^{hrt}, E_2^{hrt}, \dots, E_{N_D}^{hrt}\} \in \mathbb{R}^{N_D \times C_e}$, where $C_e$ denotes the number of channels and $C_e = 3k$.
In the field of computer vision, neural networks constructed solely from multi-layer perceptron (MLP) structures have proven efficient at enabling the interaction of different input dimensions [29]. Inspired by this technique, our triplet embedding aggregation module is constructed using the MLP-mixer structure [29]. An MLP-mixer layer fuses information across all embedding channels and triplets in the feature space by applying a channel-mixing MLP and a token-mixing MLP. Specifically, the mixer contains multiple layers of identical size, and each layer includes two MLP blocks: the first conducts token mixing and the second conducts channel mixing, as shown in Figure 4. The process of each MLP-mixer layer can be formulated as follows:
$$U_{*,i} = X_{*,i} + W_2\,\delta(W_1\,\mathrm{Norm}(X)_{*,i}), \quad \text{for } i = 1, \dots, C_e \qquad (5)$$
$$Y_{j,*} = U_{j,*} + W_4\,\delta(W_3\,\mathrm{Norm}(U)_{j,*}), \quad \text{for } j = 1, \dots, N_D \qquad (6)$$
where $X_{*,i}$ denotes the elements at the $i$th channel across the triplets of a document; $U_{j,*}$ denotes the output feature of the $j$th triplet across channels; $\delta(\cdot)$ is the non-linear function GELU (Gaussian Error Linear Unit) [30] and $\mathrm{Norm}(\cdot)$ is layer normalization [31]; $W_i$ ($i \in \{1, 2, 3, 4\}$) denotes the parameters of the fully connected layers; and $Y$ denotes the output of an MLP-mixer layer. In this way, the token-mixing MLP considers interactions between different triplets, while the channel-mixing MLP lets per-triplet features interact across channels. Together, these two mixing MLPs enable fusion along both input dimensions.
After obtaining the output Y using the MLP mixing operation, it is fed into a fully connected layer to obtain the document-level knowledge representation:
$$Y_D = \mathrm{ReLU}(W Y + b) \qquad (7)$$
where $W$ and $b$ are the learnable weight and bias parameters, and $\mathrm{ReLU}(\cdot)$ denotes the non-linear function ReLU (Rectified Linear Unit).
After the process of the triplet aggregation module, the 128-dimensional document-level knowledge representation Y D can be obtained. Then, it is concatenated with the text representation V D from the text embedding subnetwork, which can be treated as the guidance of text semantics and provides the basis of the subsequent comparison of text and knowledge embeddings in Section 3.4.
$$Z_D = Y_D \oplus V_D \qquad (8)$$
where $Z_D$ denotes the detection feature and $\oplus$ denotes the concatenation operation.
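For concreteness, one MLP-mixer layer over the triplet representation matrix $X \in \mathbb{R}^{N_D \times C_e}$ can be sketched in NumPy as follows. This is an illustrative re-implementation of the token-mixing and channel-mixing steps, with hidden widths chosen freely by the caller, and the GELU here uses the common tanh approximation:

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize each row (triplet representation) to zero mean, unit variance.
    mu = x.mean(axis=-1, keepdims=True)
    sigma = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sigma + eps)

def gelu(x):
    # Common tanh approximation of the GELU activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def mixer_layer(X, W1, W2, W3, W4):
    # X: (N_D, C_e) matrix of triplet representations.
    # Token mixing: mix information across triplets, per channel (columns of X).
    U = X + W2 @ gelu(W1 @ layer_norm(X))
    # Channel mixing: mix information across channels, per triplet (rows of U).
    Y = U + gelu(layer_norm(U) @ W3.T) @ W4.T
    return Y
```

Here `W1`/`W2` act on the triplet (token) dimension and `W3`/`W4` on the channel dimension, with skip connections around each block, mirroring the two mixing steps described above.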
3. Triplet scoring module based on rationality measurement: To further enhance the detection capability based on text semantics and external knowledge, we construct a triplet scoring module based on rationality measurement. It measures irrationality from the perspective of general knowledge as a complementary clue, considering the relationship of entities and relations in the knowledge embedding space.
This module builds on an important assumption of knowledge embedding: for each triplet $(h, r, t)$, the representations of the two entities and the relation should satisfy the vector constraint $E_h + E_r \approx E_t$. Under this constraint, a smaller distance between the translated head entity and the tail entity in the feature space implies a more reasonable triplet. The objective function of knowledge embedding in the training phase can be formulated as follows:
$$\mathcal{L} = \sum_{(h, r, t) \in S} \sum_{(h', r', t') \in S'} \left[ \gamma + d_m(h, r, t) - d_m(h', r', t') \right]_+ \qquad (9)$$
where $d_m(\cdot)$ is the distance measurement function with $d_m(h, r, t) = \| E_h + E_r - E_t \|_2$, and $[\cdot]_+$ denotes taking the positive part. In the above equation, $S$ is the set of triplets, $S'$ is the set of negative samples obtained by randomly replacing the head and tail entities of the triplets, and $\gamma$ is a margin hyperparameter controlling the distance.
Therefore, for the $i$-th triplet with embeddings $(E^i_h, E^i_r, E^i_t)$, the rationality score is calculated as follows, where a smaller score denotes better rationality of the triplet:
$$s_i = \lVert E^i_h + E^i_r - E^i_t \rVert_2$$
In the task of fake news detection, measuring document-level rationality is essential. After scoring all triplets of a document, we obtain the scoring sequence $S_r = (s_1, s_2, \ldots, s_{N_D})$. The scoring sequence of a document is then fed into a multi-layer perceptron consisting of three fully connected layers with 64, 32, and 1 neurons, respectively, followed by a sigmoid function for score aggregation. It calculates the rationality score from the perspective of general knowledge:
$$t_1 = \mathrm{Dropout}(\mathrm{ReLU}(W_{t_1} S_r + b_{t_1}))$$
$$t_2 = \mathrm{Dropout}(\mathrm{ReLU}(W_{t_2} t_1 + b_{t_2}))$$
$$p_{hrt} = \mathrm{Sigmoid}(W_{t_3} t_2 + b_{t_3})$$
where $\mathrm{Dropout}(\cdot)$ denotes the dropout function and $\mathrm{ReLU}(\cdot)$ denotes the non-linear activation function ReLU; $W_{t_i}$ and $b_{t_i}$, $i \in \{1, 2, 3\}$, denote the parameters of the fully connected layers. For an input $x$, the sigmoid function $\mathrm{Sigmoid}(\cdot)$ is calculated as follows:
$$\mathrm{Sigmoid}(x) = \frac{1}{1 + \exp(-x)}$$
The final output $p_{hrt}$ is the rationality score based on external knowledge, which measures the rationality of a document from a general knowledge perspective.
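The scoring head can be sketched in pure Python as follows (layer sizes and weights are toy values; the paper's layers have 64, 32, and 1 neurons, and dropout is active only during training):

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(v):
    return [max(0.0, x) for x in v]

def linear(W, x, b):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def dropout(v, p, training):
    # Inverted dropout; identity at inference time
    if not training:
        return v
    return [0.0 if random.random() < p else x / (1 - p) for x in v]

def score_aggregation(S_r, params, training=False, p_drop=0.2):
    # Two hidden fully connected layers with ReLU + dropout,
    # then a final linear layer squashed by a sigmoid -> p_hrt
    (W1, b1), (W2, b2), (W3, b3) = params
    t1 = dropout(relu(linear(W1, S_r, b1)), p_drop, training)
    t2 = dropout(relu(linear(W2, t1, b2)), p_drop, training)
    return sigmoid(linear(W3, t2, b3)[0])

random.seed(0)
S_r = [0.1, 1.3, 0.4, 2.0]   # toy scoring sequence, N_D = 4 triplets

def rand_layer(n_out, n_in):
    # Small random weights; biases at zero
    return ([[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
            [0.0] * n_out)

params = [rand_layer(3, 4), rand_layer(2, 3), rand_layer(1, 2)]
p_hrt = score_aggregation(S_r, params)
print(p_hrt)
```

The sigmoid guarantees $p_{hrt} \in (0, 1)$, so it can be fused directly with the text-semantics score in Section 3.4.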

3.4. The Interaction Module of Text Semantics and General Knowledge

Finally, the interaction is conducted, with learnable weights, between the rationality score based on general knowledge and that based on text semantics guided by external knowledge. For the detection feature $Z_D$, which contains text semantics guided by external knowledge, we construct a multi-layer perceptron ($\mathrm{MLP}_{text}$) by stacking three fully connected layers with 256, 128, and 128 neurons, respectively, followed by a sigmoid function. This yields the rationality score of knowledge-guided text semantics, as shown in the following equation.
$$p_{text} = \mathrm{Sigmoid}(\mathrm{MLP}_{text}(Z_D))$$
In other words, this multi-layer perceptron compares the text and knowledge embeddings in a data-driven manner, where $Z_D = Y_D \oplus V_D$, as shown in Equation (8). The rationality score $p_{text}$ complements $p_{hrt}$, which is derived from general knowledge. Therefore, to enable the interaction between text semantics and general knowledge, the outputs of the two subnetworks are integrated by an interaction module with an efficient multi-layer perceptron architecture, yielding the final fake news detection result.
As shown in Figure 5, the interaction between $p_{text}$ and $p_{hrt}$ can be expressed as a weighted summation:
$$p_{predict} = g \times p_{text} + (1 - g) \times p_{hrt}$$
where $g$ is a learnable weight obtained through the model training process. After processing by this interaction module, the final detection result $p_{predict}$ is obtained and compared with a pre-defined threshold $T_p$: if $p_{predict} > T_p$, the input document is classified as fake news; otherwise, it is classified as real news.
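The fusion and thresholding step is simple enough to sketch directly (the scores, the gate value $g$, and the threshold are toy values for illustration; in the model, $g$ is learned during training):

```python
def interact(p_text, p_hrt, g, T_p=0.5):
    # Learnable-weight fusion of the two rationality scores,
    # followed by comparison with the detection threshold T_p
    p_predict = g * p_text + (1 - g) * p_hrt
    label = "fake" if p_predict > T_p else "real"
    return p_predict, label

# Toy case: the text branch leans fake, the knowledge branch leans real,
# and a gate of g = 0.7 weights the text branch more heavily
p_predict, label = interact(p_text=0.9, p_hrt=0.3, g=0.7)
print(p_predict, label)
```

Because $g$ is a single learnable scalar, the model itself discovers how much to trust each branch rather than fixing the trade-off by hand.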

3.5. The Design of the Loss Function

Generally, the fake news detection task can be treated as a binary classification problem. Therefore, the binary cross-entropy loss is applied as the objective function for parameter optimization during the training phase. Moreover, to mitigate the risk of overfitting, L2 regularization for network parameters is also considered, as shown in Equation (17):
$$L(p_{predict}, y) = -\left[ y \log p_{predict} + (1 - y) \log(1 - p_{predict}) \right] + \lambda \lVert w \rVert_2^2$$
where $\lambda$ denotes the regularization parameter and $w$ denotes the set of model parameters of the detection network. $y \in \{0, 1\}$ is the label of the input document, where 0 and 1 denote that document $D$ is real and fake news, respectively.
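The loss can be sketched directly from the equation (the parameter vector and $\lambda$ are toy values; in practice this is `BCELoss` plus weight decay in PyTorch):

```python
import math

def detection_loss(p_predict, y, weights, lam=1e-4):
    # Binary cross-entropy plus L2 regularization on the network parameters
    bce = -(y * math.log(p_predict) + (1 - y) * math.log(1 - p_predict))
    l2 = lam * sum(w * w for w in weights)
    return bce + l2

w = [0.5, -1.2, 0.3]                 # stand-in for the model parameters
small = detection_loss(0.95, 1, w)   # confident, correct on a fake document
large = detection_loss(0.95, 0, w)   # confident, wrong on a real document
print(small, large)
```

A confident wrong prediction is penalized far more heavily than a confident correct one, which is the property that drives the gradient updates during training.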

4. Experiments

In this section, the performance of the proposed method is evaluated and compared with that of other state-of-the-art fake news detection methods (The code is available at https://github.com/Crimson725/fake_news_detection, accessed on 2 January 2024). An ablation study is also conducted to analyze different components of the proposed method.

4.1. Fake News Dataset

In the experiments, two publicly accessible datasets are used: LUN (Labeled Unreliable News Dataset) [32] and SLN (Satirical and Legitimate News Database) [33].
Specifically, the LUN dataset is a large fake news dataset that contains a total of 74,476 news documents. Among them, 13,995 real news articles were collected from the Gigaword corpus, while 60,481 fake news articles were divided into three categories: satire, hoax, and propaganda. Fake news was collected from several fake news media platforms, including “The Onion”. As in [24], for the LUN dataset, we only consider the satire-type news documents from “The Onion” as fake news.
On the other hand, the SLN dataset contains 360 news articles, where 180 fake news articles were collected from the fake news media platform “The Onion” and the satirical news media platform “The Beaverton”, while the remaining 180 real news articles were collected from established news media platforms, namely, “The New York Times” and “The Toronto Star”. In this dataset, each piece of fake news has a corresponding real news item on the same topic. The SLN dataset contains four categories of topics: people’s livelihood, science, business, and other news. Moreover, these four categories of topics are divided into 12 sub-topics.
Following the setting in [24], the LUN-train dataset is used for training, the LUN-test dataset is used as the verification set, and the SLN dataset is used for the performance evaluation. This setting is considered to emulate a real-world scenario where the performance of detection models on an out-of-domain dataset can be analyzed.

Evaluation Metrics

In the experiments, accuracy, recall, and macro-F1 are considered as the evaluation metrics. Accuracy indicates the proportion of samples correctly predicted by the detection model among all samples. Recall indicates the proportion of actual positive samples that are correctly predicted by the detection model; a higher recall indicates better detection capability. Macro-F1 averages the per-class F1 score, i.e., the harmonic mean of recall and precision (precision denotes the proportion of samples classified as positive that are truly positive), and describes the performance of the classifier well, especially when the numbers of samples from different categories are imbalanced.
The calculation of macro-F1 can be formulated as Equation (18) when the number of categories for classification is $n$:
$$F1_{macro} = \frac{1}{n} \sum_{i=1}^{n} F1_i$$
where $F1_i$ denotes the F1 score of the $i$-th category.
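The metric can be sketched as follows (the confusion counts are illustrative, not taken from the experiments):

```python
def f1(tp, fp, fn):
    # Per-class F1: harmonic mean of precision and recall
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(per_class_counts):
    # Unweighted mean of per-class F1: every category contributes equally,
    # regardless of how many samples it has
    scores = [f1(tp, fp, fn) for tp, fp, fn in per_class_counts]
    return sum(scores) / len(scores)

# Toy confusion counts (tp, fp, fn) for a binary fake/real task
counts = [(90, 10, 20), (80, 20, 10)]
print(macro_f1(counts))
```

Because each class contributes equally, a model that ignores a minority class is penalized, which is why macro-F1 is preferred under class imbalance.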

4.2. Experimental Settings

For the training phase, the maximum number of epochs is set to 15 and the batch size is 12. The parameters of the detection network are optimized using the Adam optimizer with a learning rate of $1 \times 10^{-5}$. The dimension of the text embedding from the BERT model is 768. The number of triplets extracted from a document is set to 20. The TransE algorithm is trained on the CommonSense Knowledge Graph [23]. The detection model is implemented with the PyTorch library, and the experiments are conducted on a device equipped with an Intel(R) Xeon(R) 4210 CPU and an RTX 3090 GPU.

4.3. Comparison Experiment

In the comparison experiment, we first consider two basic deep neural models for sentence classification, CNN [34] and LSTM [35], to detect fake news. Specifically, the convolutional neural network is trained on top of pre-trained word vectors, while the LSTM-based model captures sentiment information over relatively long time steps [35]. Both detect fake news by leveraging only the text content.
In addition, more advanced fake news detection methods that incorporate external knowledge are used for comparison, including the Knowledge-driven Multimodal Graph Convolutional Network (KMGCN) [36], the Dual Co-Attention Network (Dual-CAN) [37], and the Sentiment Mixed Heterogeneous Network (SMHN) [24]. Specifically, to make use of background knowledge, KMGCN converts word sequences and knowledge concepts into a graph and constructs a graph convolutional network to detect fake news; in the experiments, the version of KMGCN without visual information is considered. Dual-CAN considers news content and external knowledge with a co-attention module to detect fake news; due to the network structure and the scale of the dataset used in the experiments, we consider the Dual-CAN model using news text content and extracted entities. SMHN constructs a graph attention network to capture sentiment information, such as emotional change, and cooperates with knowledge representations using LSTM to detect fake news. Experimental results are presented in Table 2, where the best results are shown in bold.
From Table 2, it can be observed that the proposed method achieves a distinct improvement over the other state-of-the-art methods; for example, its recall is higher than that of the other methods by more than 6%. Moreover, detection models relying only on text data, such as CNN and LSTM, perform significantly worse than methods using external knowledge, which suggests that introducing external knowledge is necessary for more reliable fake news detection. Although KMGCN, Dual-CAN, and SMHN also consider text semantics and external knowledge, their detection performance is distinctly worse than that of the proposed method. This verifies the effectiveness of the knowledge-guided semantic analysis framework, which can efficiently reveal irrational content even when the training and testing sets are constructed from different sources of news documents. In addition, the carefully designed interaction and fusion strategies between text semantics and external knowledge are also essential for fake news detection.
We further evaluate the detection performance of the proposed method and SMHN on the trusted and hoax documents of the LUN dataset. The samples in LUN-train are randomly divided into two subsets with a 9:1 ratio for training and validation, and the detection methods are then evaluated on samples of the LUN-test dataset; other experimental settings remain unchanged. According to the detection results, the proposed method still outperforms SMHN by a distinct margin, with accuracies of 89.91% and 83.73%, respectively. It is unsurprising that hoaxes are more challenging to identify, since hoax-type news is written to convince the reader of the validity of a story, with the writer presenting it in a normal (truthful) style similar to real news. Although we obtained promising detection results by considering external knowledge based on the common sense knowledge graph, the detection capability may degrade for news documents containing technical topics and professional details. A potential solution, left to future work, is to identify the topic of the input news document and construct a knowledge graph with domain-specific knowledge to improve detection performance in such scenarios.
In addition, we evaluate the computational times of different detection methods, including one text-content-based method (LSTM) and two external-knowledge-based methods (SMHN and the proposed method). Two hundred samples are randomly selected from the training dataset, where the average document length is about 400 words. The average computational times of LSTM, SMHN, and our method are 0.065 s, 1.257 s, and 0.325 s, respectively. It is not surprising that the text-content-based method has outstanding computational efficiency, since no external knowledge is used. The proposed method has much better time efficiency than SMHN due to the lightweight design of its network architecture and the efficient triplet alignment strategy. Moreover, sacrificing some time efficiency to incorporate external knowledge is worthwhile for achieving more reliable detection results.

4.4. Performance Analysis with Different Text Embedding Networks

In the proposed method, the text embedding network (referred to as TEN) plays an essential role in obtaining the representation of the text semantics of the input news document, so it is important to verify the efficiency of the selected network. Therefore, the influence of applying different text embedding subnetworks is analyzed in this experiment. Specifically, the BERT model used in our text embedding subnetwork is evaluated against two other widely used word embedding models, GloVe [38] and FastText [39].
The modified detection networks with different text embedding models are presented as follows:
1. TEN-GloVe: In the text embedding subnetwork, the BERT model is replaced by GloVe, which obtains the representation vector of each word in the document; the average of these vectors is taken as the text semantic embedding of the document.
2. TEN-FastText: The same as TEN-GloVe, but FastText is used to obtain the representation vector of each word in the document.
3. Proposed method: The proposed detection model in this work.
The other experimental settings are identical to those in Section 4.2. The experimental results are shown in Table 3.
It can be observed that applying BERT as the text embedding subnetwork obtains better performance by a clear margin; for example, the recall of the proposed method is higher than that of the other word embedding models by more than 14%. These results are unsurprising, since GloVe and FastText are more traditional word embedding models, whereas the BERT model leverages advanced techniques such as the transformer structure, which sufficiently captures the semantic features of the text context. The experimental results in this section demonstrate that selecting a proper text embedding subnetwork is important for the fake news detection task: by applying a more powerful text embedding subnetwork that captures richer contextual information, the performance can be improved distinctly. This implies that the detection capability can be further enhanced by introducing better text embedding subnetworks in future work.

4.5. Performance Analysis with Different Knowledge Embedding Networks

In this experiment, the significance of the knowledge embedding network (referred to as KEN) is evaluated, which consists of a triplet embedding aggregation module and a triplet scoring module. The modified versions of the proposed framework are presented as follows:
1. KEN\tri_a: In this case, the triplet embedding aggregation module is removed from the knowledge embedding subnetwork; the text semantic embedding and the document triplet scores obtained from the triplet scoring module are used to detect fake news.
2. KEN\tri_s: In this case, the triplet scoring module is removed; the text semantic representation and the aggregation representation obtained by the triplet embedding aggregation module are concatenated as the detection feature $Z_D$, which contains rich text semantics guided by external knowledge, to expose fake news.
3. Non-KEN: The knowledge embedding subnetwork is removed completely; fake news detection is conducted by applying only the text embedding subnetwork.
4. Proposed method: The proposed dual-branch model based on knowledge-guided semantic analysis.
Other experimental settings are identical to those in Section 4.2. The experimental results are shown in Table 4.
As shown in Table 4, the proposed method can achieve the best performance for all metrics with clear margins. The following conclusions can be drawn:
1. When the triplet aggregation module is removed (KEN\tri_a), the detection model is inefficient in capturing the knowledge information of triplets across the whole document, which results in a performance drop.
2. When the triplet scoring module is removed (KEN\tri_s), the detection model cannot measure the rationality of the entire document, leading to performance degradation.
3. The Non-KEN model suffers a distinct performance drop since knowledge embedding is ignored entirely. This implies that the text semantic embedding alone is insufficient to expose fake news precisely.
In summary, this ablation study about different components of the knowledge embedding subnetwork demonstrates that the general knowledge from the knowledge graph is significant for fake news detection, and a proper embedding method should be constructed to leverage external knowledge.

4.6. Performance Evaluation on Different Fake News Datasets

In the previous experiments, the LUN and SLN datasets were applied to train and evaluate the detection models. The fake news and real news in these two datasets are derived from conventional news reports, which have long pieces of text, a plain writing style, and less subjective content. However, in practical applications, e.g., social networks, misinformation may be spread in the form of “short text”.
Therefore, it is important to evaluate the performance of the proposed method for short text. In this experiment, two commonly used fake news datasets on social media are considered: Twitter15 and Twitter16 [40]. We only consider the text content in this experiment. According to the protocol given in [41], the training and testing subsets for Twitter15 and Twitter16 are obtained. Experimental settings are identical to those in Section 4.2. The results are presented in Table 5.
As shown in Table 5, the proposed method can still obtain promising performance on fake information in the form of "short text". It should be noted that text samples in Twitter15 and Twitter16 are short and have no fixed format, making it difficult to detect fake news precisely when leveraging only the text semantics embedding. The proposed method can detect fake news with different text styles by considering both text semantics and external knowledge.

5. Conclusions and Future Work

In this paper, we focus on building an efficient detection framework for fake news by leveraging external knowledge, based on the finding that the guidance and interaction strategies of external knowledge with text semantics are crucial for improving detection performance. To this end, a fuzzy-matching-based entity alignment technique is designed to boost the triplet alignment process. In addition, to measure rationality in view of text semantics, a triplet aggregation operation based on a token and channel mixing mechanism is proposed to guide the text semantic analysis, which fully conducts interactions between knowledge representations within and across different triplets. Furthermore, we design a triplet scoring module based on the relationships of entities and relations in the knowledge embedding space to measure rationality in view of general knowledge. Experimental results on different fake news datasets verify that the proposed detection method is efficient in exposing fake news compared with other advanced methods. In future work, for fake news with short text, such as posts on social networks, we plan to extend the proposed detection method by leveraging data with multiple modalities, such as text and image content, where the semantic consistency of data across modalities can also be treated as an important clue for exposing fake news.

Author Contributions

Conceptualization, W.Z. and Z.Z.; methodology, W.Z., P.H. and Z.Z.; software, X.X. and Z.Z.; validation, X.X.; formal analysis, P.H. and X.X.; investigation, W.Z. and P.H.; resources, W.Z., P.H. and Z.Z.; data curation, X.X. and Z.Z.; writing—original draft preparation, W.Z. and Z.Z.; writing—review and editing, P.H.; supervision, P.H.; project administration, X.X. and P.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available in this article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

  1. Guo, B.; Ding, Y.; Yao, L.; Liang, Y.; Yu, Z. The future of false information detection on social media: New perspectives and trends. ACM Comput. Surv. 2020, 53, 1–36. [Google Scholar] [CrossRef]
  2. Zhao, J.; Zhao, Z.; Shi, L.; Kuang, Z.; Liu, Y. Collaborative mixture-of-experts model for multi-domain fake news detection. Electronics 2023, 12, 3440. [Google Scholar] [CrossRef]
  3. Gangireddy, S.C.R.; P, D.; Long, C.; Chakraborty, T. Unsupervised fake news detection: A graph-based approach. In Proceedings of the HT ’20: 31st ACM Conference on Hypertext and Social Media, Virtual Event, 13–15 July 2020; pp. 75–83. [Google Scholar]
  4. Yuan, L.; Shen, H.; Shi, L.; Cheng, N.; Jiang, H. An explainable fake news analysis method with stance information. Electronics 2023, 12, 3367. [Google Scholar] [CrossRef]
  5. Silva, A.; Han, Y.; Luo, L.; Karunasekera, S.; Leckie, C. Propagation2Vec: Embedding partial propagation networks for explainable fake news early detection. Inf. Process. Manag. 2021, 58, 102618. [Google Scholar] [CrossRef]
  6. Hu, L.; Yang, T.; Zhang, L.; Zhong, W.; Tang, D.; Shi, C.; Duan, N.; Zhou, M. Compare to the knowledge: Graph neural fake news detection with external knowledge. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Virtual Event, 1–6 August 2021; pp. 754–763. [Google Scholar]
  7. Hu, L.; Wei, S.; Zhao, Z.; Wu, B. Deep learning for fake news detection: A comprehensive survey. AI Open 2022, 3, 133–155. [Google Scholar] [CrossRef]
  8. Potthast, M.; Kiesel, J.; Reinartz, K.; Bevendorff, J.; Stein, B. A stylometric inquiry into hyperpartisan and fake news. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, 15–20 July 2018; pp. 231–240. [Google Scholar]
  9. Kong, S.H.; Tan, L.M.; Gan, K.H.; Samsudin, N.H. Fake news detection using deep learning. In Proceedings of the IEEE 10th Symposium on Computer Applications & Industrial Electronics (ISCAIE), Penang, Malaysia, 18–19 April 2020; pp. 102–107. [Google Scholar]
  10. Vaibhav, V.; Annasamy, R.M.; Hovy, E.H. Do sentence interactions matter? leveraging sentence level representations for fake news classification. In Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing, Hong Kong, China, 4 November 2019; pp. 134–139. [Google Scholar]
  11. Nguyen, V.; Sugiyama, K.; Nakov, P.; Kan, M. FANG: Leveraging social context for fake news detection using graph representation. In Proceedings of the 29th ACM International Conference on Information and Knowledge Management, Virtual Event, 19–23 October 2020; pp. 1165–1174. [Google Scholar]
  12. Jin, Z.; Cao, J.; Zhang, Y.; Luo, J. News Verification by Exploiting Conflicting Social Viewpoints in Microblogs. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; pp. 2972–2978. [Google Scholar]
  13. Oshikawa, R.; Qian, J.; Wang, W.Y. A survey on natural language processing for fake news detection. In Proceedings of the 12th Language Resources and Evaluation Conference, Palais du Pharo, France, 11–16 May 2020; pp. 6086–6093. [Google Scholar]
  14. Wu, L.; Rao, Y.; Jin, H.; Nazir, A.; Sun, L. Different absorption from the same sharing: Sifted multi-task learning for fake news detection. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, Hong Kong, China, 3–7 November 2019; pp. 4643–4652. [Google Scholar]
  15. Zhang, J.; Dong, B.; Yu, P.S. FakeDetector: Effective fake news detection with deep diffusive neural network. In Proceedings of the 36th IEEE International Conference on Data Engineering, Dallas, TX, USA, 20–24 April 2020; pp. 1826–1829. [Google Scholar]
  16. Bian, T.; Xiao, X.; Xu, T.; Zhao, P.; Huang, W.; Rong, Y.; Huang, J. Rumor detection on social media with bi-directional graph convolutional networks. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 549–556. [Google Scholar]
  17. Dou, Y.; Shu, K.; Xia, C.; Yu, P.S.; Sun, L. User preference-aware fake news detection. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, 11–15 July 2021; pp. 2051–2055. [Google Scholar]
  18. Zhang, H.; Fang, Q.; Qian, S.; Xu, C. Multi-modal knowledge-aware event memory network for social media rumor detection. In Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 21–25 October 2019; pp. 1942–1951. [Google Scholar]
  19. Wu, K.; Yuan, X.; Ning, Y. Incorporating relational knowledge in explainable fake news detection. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Delhi, India, 11–14 May 2021; pp. 403–415. [Google Scholar]
  20. Li, J.; Ni, S.; Kao, H. Meet the truth: Leverage objective facts and subjective views for interpretable rumor detection. In Proceedings of the Findings of the Association for Computational Linguistics, Online, 1–6 August 2021; pp. 705–715. [Google Scholar]
  21. Cabot, P.L.H.; Navigli, R. REBEL: Relation extraction by end-to-end language generation. In Proceedings of the Findings of the Association for Computational Linguistics, Online, 1–6 August 2021; pp. 2370–2381. [Google Scholar]
  22. Lewis, M.; Liu, Y.; Goyal, N.; Ghazvininejad, M.; Mohamed, A.; Levy, O.; Stoyanov, V.; Zettlemoyer, L. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 7871–7880. [Google Scholar]
  23. Ilievski, F.; Szekely, P.; Zhang, B. Cskg: The commonsense knowledge graph. In Proceedings of the The Semantic Web: 18th International Conference, Virtual Event, 6–10 June 2021; pp. 680–696. [Google Scholar]
  24. Zhang, H.; Li, Z.; Liu, S.; Huang, T.; Ni, Z.; Zhang, J.; Lv, Z. Do sentence-level sentiment interactions matter? sentiment mixed heterogeneous network for fake news detection. IEEE Trans. Comput. Soc. Syst. 2023, 1–11. [Google Scholar] [CrossRef]
  25. Navarro, G. A guided tour to approximate string matching. ACM Comput. Surv. 2001, 33, 31–88. [Google Scholar] [CrossRef]
  26. Devlin, J.; Chang, M.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186. [Google Scholar]
  27. Turc, I.; Chang, M.; Lee, K.; Toutanova, K. Well-read students learn better: The impact of student initialization on knowledge distillation. arXiv 2019, arXiv:1908.08962. [Google Scholar]
  28. Bordes, A.; Usunier, N.; García-Durán, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. In Proceedings of the 27th Annual Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–10 December 2013; pp. 2787–2795. [Google Scholar]
  29. Tolstikhin, I.O.; Houlsby, N.; Kolesnikov, A.; Beyer, L.; Zhai, X.; Unterthiner, T.; Yung, J.; Steiner, A.; Keysers, D.; Uszkoreit, J.; et al. MLP-Mixer: An all-MLP architecture for vision. In Proceedings of the Annual Conference on Neural Information Processing Systems, Online, 6–14 December 2021; pp. 24261–24272. [Google Scholar]
  30. Hendrycks, D.; Gimpel, K. Gaussian error linear units (GELUs). arXiv 2016, arXiv:1606.08415. [Google Scholar]
  31. Ba, L.J.; Kiros, J.R.; Hinton, G.E. Layer normalization. arXiv 2016, arXiv:1607.06450. [Google Scholar]
  32. Rashkin, H.; Choi, E.; Jang, J.Y.; Volkova, S.; Choi, Y. Truth of varying shades: Analyzing language in fake news and political fact-checking. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 9–11 September 2017; pp. 2931–2937. [Google Scholar]
  33. Rubin, V.L.; Conroy, N.; Chen, Y.; Cornwell, S. Fake news or truth? using satirical cues to detect potentially misleading news. In Proceedings of the Workshop on Computational Approaches to Deception Detection, Avignon, France, 23–27 April 2016; pp. 7–17. [Google Scholar]
  34. Kim, Y. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, 25–29 October 2014; pp. 1746–1751. [Google Scholar]
  35. Rao, G.; Huang, W.; Feng, Z.; Cong, Q. LSTM with sentence representations for document-level sentiment classification. Neurocomputing 2018, 308, 49–57. [Google Scholar] [CrossRef]
  36. Wang, Y.; Qian, S.; Hu, J.; Fang, Q.; Xu, C. Fake news detection via knowledge-driven multimodal graph convolutional networks. In Proceedings of the International Conference on Multimedia Retrieval, Dublin, Ireland, 8–11 June 2020; pp. 540–547. [Google Scholar]
  37. Yang, S.H.; Chen, C.C.; Huang, H.H.; Chen, H.H. Entity-aware dual co-Attention network for fake news detection. arXiv 2023, arXiv:2302.03475. [Google Scholar]
  38. Pennington, J.; Socher, R.; Manning, C.D. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, 25–29 October 2014; pp. 1532–1543. [Google Scholar]
  39. Joulin, A.; Grave, E.; Bojanowski, P.; Mikolov, T. Bag of tricks for efficient text classification. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, Valencia, Spain, 3–7 April 2017; pp. 427–431. [Google Scholar]
  40. Ma, J.; Gao, W.; Wong, K. Detect rumors in microblog posts using propagation structure via kernel learning. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, BC, Canada, 30 July–4 August 2017; pp. 708–717. [Google Scholar]
  41. Lu, Y.; Li, C. GCAN: Graph-aware co-attention networks for explainable fake news detection on social media. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 505–514. [Google Scholar]
Figure 1. The diagram of the proposed fake news detection model based on knowledge-guided semantic analysis.
Figure 2. An example of the triplet extraction and triplet matching processes.
Figure 3. Some examples of alignment results between the document and the knowledge graph.
Figure 4. The MLP-mixer layer, which includes a token mixing block and a channel mixing block.
Figure 5. The interaction module between text semantic and general knowledge.
Table 1. Comparison between different types of detection methods.
| Method | Information | Benefit | Drawback |
|---|---|---|---|
| Text-content-based method | Text | High computational efficiency | Hard to deal with synonyms and to explore irrationality beyond text information |
| Social-network-based method | Text and social network data | The expected performance tends to be higher when user/propagation data are available | May become invalid to deploy due to data privacy |
| External-knowledge-based method | Text and external knowledge | The expected performance tends to be higher when external knowledge is available | Requires careful design of the fusion strategy between text and external knowledge |
Table 2. Detection performance of comparison experiments (%).
| Detection Method | Information | Accuracy | Recall | Macro-F1 |
|---|---|---|---|---|
| CNN | Text | 78.93 | 78.42 | 79.04 |
| LSTM | Text | 80.63 | 81.00 | 80.32 |
| KMGCN | Text and EK ¹ | 85.13 | 86.27 | 85.75 |
| Dual-CAN | Text and EK | 87.65 | 87.12 | 88.35 |
| SMHN | Text and EK | 89.12 | 89.54 | 89.29 |
| Proposed method | Text and EK | 94.45 | 95.61 | 94.38 |

¹ EK denotes external knowledge.
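For reference, the metrics reported in Tables 2–5 (accuracy, macro-averaged recall, and macro-F1) can be computed for binary fake/real labels as in the sketch below. The toy label vectors are invented for illustration and are unrelated to the results in Table 2.

```python
# Minimal sketch of accuracy, macro-recall, and macro-F1 for binary labels.
def metrics(y_true, y_pred, labels=(0, 1)):
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    recalls, f1s = [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        recalls.append(rec)
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    # Macro averaging: unweighted mean over the per-class scores.
    return acc, sum(recalls) / len(recalls), sum(f1s) / len(f1s)

y_true = [1, 0, 1, 1, 0, 1, 0, 0]  # toy ground truth (1 = fake, 0 = real)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # toy predictions
acc, macro_rec, macro_f1 = metrics(y_true, y_pred)
print(acc, macro_rec, macro_f1)  # 0.75 0.75 0.75
```

Macro averaging weights both classes equally, which matters on imbalanced fake/real datasets where plain accuracy can be misleading.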
Table 3. The detection performance with different text embedding subnetworks (%).
| Model | Accuracy | Recall | Macro-F1 |
|---|---|---|---|
| TEN-GloVe | 81.67 | 80.78 | 81.63 |
| TEN-FastText | 82.39 | 81.22 | 82.26 |
| Proposed method | 94.45 | 95.61 | 94.38 |
Table 4. The detection performance with different knowledge embedding subnetworks (%).
| Model | Accuracy | Recall | Macro-F1 |
|---|---|---|---|
| KEN\tri_a | 91.06 | 93.11 | 91.05 |
| KEN\tri_s | 92.22 | 94.78 | 91.19 |
| Non-KEN | 86.28 | 88.35 | 87.26 |
| Proposed method | 94.45 | 95.61 | 94.38 |
Table 5. Performance evaluation on fake news with short text (%).
| Dataset | Accuracy | Recall | Macro-F1 |
|---|---|---|---|
| Twitter15 | 93.51 | 93.66 | 92.83 |
| Twitter16 | 90.87 | 88.97 | 90.24 |
Share and Cite

Zhao, W.; He, P.; Zeng, Z.; Xu, X. Fake News Detection Based on Knowledge-Guided Semantic Analysis. Electronics 2024, 13, 259. https://doi.org/10.3390/electronics13020259
