Article

Joint Extraction Method for Hydraulic Engineering Entity Relations Based on Multi-Features

1 Provincial Collaborative Innovation Center for Efficient Utilization of Water Resources in the Yellow River Basin, North China University of Water Resources and Electric Power, Zhengzhou 450046, China
2 School of Information Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450046, China
* Author to whom correspondence should be addressed.
Electronics 2024, 13(15), 2979; https://doi.org/10.3390/electronics13152979
Submission received: 18 June 2024 / Revised: 13 July 2024 / Accepted: 25 July 2024 / Published: 28 July 2024
(This article belongs to the Topic Artificial Intelligence Models, Tools and Applications)

Abstract

During the joint extraction of entity and relationship from the operational management data of hydraulic engineering, complex sentences containing multiple triplets and overlapping entity relations often arise. However, traditional joint extraction models suffer from a single-feature representation approach, which hampers the effectiveness of entity relation extraction in complex sentences within hydraulic engineering datasets. To address this issue, this study proposes a multi-feature joint entity relation extraction method based on global context mechanism and graph convolutional neural networks. This method builds upon the Bidirectional Encoder Representations from Transformers (BERT) pre-trained model and utilizes a bidirectional gated recurrent unit (BiGRU) and global context mechanism (GCM) to supplement the contextual and global features of sentences. Subsequently, a graph convolutional network (GCN) based on syntactic dependencies is employed to learn inter-word dependency features, enhancing the model’s knowledge representation capabilities for complex sentences. Experimental results demonstrate the effectiveness of the proposed model in the joint extraction task on hydraulic engineering datasets. The precision, recall, and F1-score are 86.5%, 84.1%, and 85.3%, respectively, all outperforming the baseline model.

1. Introduction

Conducting research in hydraulic engineering can provide assurance for its safe operation and is of significant importance in addressing various global water resource challenges. The South-to-North Water Diversion Project is one of China’s largest hydraulic engineering projects and is aimed at alleviating water scarcity in northern China. In the operational management of this project, the engineering management office records daily risk data and proposes relevant preventive measures. These Chinese data are typically stored in unstructured text format. Effectively organizing and extracting these textual data to construct a knowledge graph in the field of hydraulic engineering can facilitate efficient knowledge management [1], aiding in addressing operational risks encountered in hydraulic engineering projects. One of the key technologies in constructing a knowledge graph involves effectively extracting structured triples from these unstructured textual data.
The triples required for a knowledge graph typically consist of a pair of entities and the relationship between them. Currently, deep learning-based entity relation extraction methods can be categorized into two types: pipeline methods and joint extraction methods [2,3]. Pipeline methods separate named entity recognition (NER) and relation extraction (RE) into two independent tasks [4]. While straightforward and easy to implement, pipeline methods may overlook the inherent connections between entities and relationships, thus affecting extraction effectiveness. To address the limitations of pipeline methods, researchers have begun exploring approaches that can extract entities and relationships simultaneously.
Compared to pipeline methods, joint extraction methods obtain entity and relationship triples within the same model while considering the semantic information shared between entities and relationships, which enables more effective use of contextual information in text [5]. Wang et al. [6] proposed a joint extraction method combining deep learning models and attention mechanisms; by learning the contextual features of sentences, they achieved good results in entity relation extraction in the military domain. In short, joint extraction methods generally outperform pipeline methods.
However, several problems arise when existing joint extraction models are applied directly to the field of hydraulic engineering. In the hydraulic engineering operation and management dataset used in this study, some sentences contain multiple triples and overlapping triples. Because existing joint extraction models rely on a single feature representation and have limited knowledge representation ability, they struggle to extract all the triples in the hydraulic engineering dataset completely. Reference [7] categorizes overlapping triples into three types: non-overlapping (normal), entity pair overlap (EPO), and single entity overlap (SEO). EPO indicates multiple relationship types for a single entity pair, while SEO refers to a single entity appearing in multiple triples. Multiple triples refer to cases where a single sentence contains more than one triple. Because current joint extraction models use a single feature representation, they have difficulty understanding the semantic information of these complex sentences, which lowers their accuracy. For instance, in our annotated hydraulic engineering dataset, the sentence “bridge pier and slope protection collapse” not only contains the triples <bridge pier, risk event, collapse> and <slope protection, risk event, collapse>, but both triples also share the tail entity “collapse”. This sentence thus exhibits both the multiple-triple and the entity-relation overlap scenarios. However, existing joint extraction models often correctly extract only one of these triples.
The key to extracting entity relations from complex sentences is understanding their complex semantic information. Li et al. [8] proposed a named entity recognition method that embeds dependency information and showed through comparative experiments that inter-word dependency features can enhance a model’s semantic representation ability. A graph convolutional network (GCN) is a type of neural network that extracts features from graph-structured data; using a GCN based on syntactic dependencies to learn the inter-word dependency features of sentences can effectively strengthen the feature representation ability of the model. Additionally, Qin et al. [9] integrated bidirectional gated recurrent units (BiGRU) into the Bidirectional Encoder Representations from Transformers (BERT) model to augment contextual features, thereby improving semantic understanding. However, Xu et al. [10] showed that sentence representations built only from sentence fragments are limited, and that global information can effectively supplement a model’s knowledge representation ability. Embedding a global context mechanism (GCM) into the BiGRU framework adds global features to each character’s feature representation. Therefore, we first use the BERT pre-trained model to obtain sentence feature representations. We then employ BiGRU embedded with GCM (BiGRU-GCM) to further enhance contextual and global features. Additionally, we use syntactic dependencies and a GCN to learn inter-word dependency information within sentences. The main contributions of this study are:
  • Using BiGRU-GCM to complement contextual and global features. We input the feature vectors obtained from the BERT model into BiGRU-GCM, enabling further extraction of contextual and global features from sentences to enhance semantic representation capabilities.
  • Learning inter-word dependency features through GCN. We utilize the syntactic dependencies within sentences to construct a word adjacency matrix. Subsequently, GCN is employed to extract inter-word dependency features. Finally, we integrate these features with contextual information to derive comprehensive multi-feature representations of sentences, thereby enhancing the model’s performance.
  • Applying a multi-feature joint extraction approach in the field of hydraulic engineering. We utilize a method based on multiple features for entity relation joint extraction to extract knowledge triplets from operation and management data in hydraulic engineering. This lays the foundation for constructing a knowledge graph in the field.
The remainder of this paper is structured as follows: Section 2 introduces related work. Section 3 provides a detailed introduction to the model’s architecture, the design of the feature extraction module, and the multi-task framework. Section 4 includes a brief introduction to the datasets, evaluation criteria, baseline models used, and an analysis of conducted experiments. Finally, Section 5 discusses the conclusions of this study and outlines our future research focus.

2. Related Work

2.1. Advancements in Entity Relation Extraction Techniques

Entity relation extraction forms the foundation for constructing knowledge graphs. In its early stages, this process primarily relied on rule-based and dictionary-based methods. Aone et al. [11] designed rule templates to screen textual data and identify correct entity relationship triplets. Although simple and effective, such methods require dictionaries and rules to be constructed and then used to filter matching strings. This demands substantial human effort and specialized domain knowledge, resulting in limited generality.
Machine learning is one of the predominant research methodologies in current natural language processing (NLP) studies. Zhou et al. [12] employed hidden Markov models (HMM) to extract entities such as names, times, and quantities, achieving noteworthy results. Finkel et al. [13] proposed a NER method based on conditional random fields (CRF), performing entity extraction tasks on annotated data. Machine learning techniques diminish the necessity for manual intervention, thereby overcoming the constraints of rule-based approaches. Nonetheless, their accuracy heavily hinges on the quality of annotated data.
With the advancement of deep learning technologies, researchers have explored the application of neural networks in entity relation extraction tasks. Zeng et al. [14] introduced convolutional neural networks (CNN) to NLP for text-based entity extraction. Zheng et al. [15] proposed a sequence labeling method for entity relation extraction, enhancing annotation techniques using deep learning models to identify entities and relations simultaneously, demonstrating effective performance across various domains. Wu et al. [16] utilized a BiGRU-based approach for entity extraction from Chinese texts. The introduction of neural network models has significantly improved the effectiveness of knowledge extraction tasks.
Due to their robust semantic understanding capabilities, pre-trained models have emerged as a popular approach in entity relation extraction research. Google’s BERT [17] utilizes self-supervised learning to effectively extract semantic information from large-scale unlabeled text data, and applying BERT to joint extraction tasks significantly enhances performance. Sun et al. [18] leveraged the BERT pre-trained model to capture sentence features, integrating it with other deep learning models for named entity recognition tasks. In the domain of unstructured mineral text data, Yu et al. [19] employed BERT as a foundational encoder, combining various neural network and machine learning models to evaluate different configurations. Experimental findings demonstrated that integrating BERT with neural networks and machine learning models excels in knowledge extraction tasks. RoBERTa [20] and ALBERT [21] are variants of BERT. RoBERTa utilizes dynamic masking, which enhances performance but demands greater computational resources. ALBERT reduces the parameter count through parameter sharing and factorized embedding matrices, improving training efficiency and inference speed, although it may have limitations in complex tasks. Consequently, this study adopts the Chinese version of Google’s BERT as the fundamental model for extracting sentence features.

2.2. Joint Extraction of Entity Relations

Due to the accumulation of errors in pipeline methods, joint extraction methods of entity relations have become the current research focus. Yu et al. [22] proposed a span-based annotation scheme, deconstructing the joint extraction task of entity relations into a sequence labeling problem. Through a multi-span decoding algorithm, semantic relevance between relation prediction and entity recognition is fully captured, effectively eliminating invalid triples, but without solving the overlapping triple problem. With the remarkable performance of graph convolutional neural networks in the NLP field [23,24], the use of graph convolution in entity relation joint extraction tasks has become a new research direction. Lai et al. [25] proposed an improved graph convolutional network-based entity relation joint extraction method. By introducing a multi-head attention mechanism to enhance the graph convolutional network, the model’s robustness is improved. Experimental results demonstrate enhanced knowledge mining ability by graph convolutional networks. Geng et al. [26] improved the model’s performance by combining attention-based convolutional neural networks and recursive neural networks to obtain rich semantic representations. Zheng et al. [27] improved the structure of the joint extraction model and proposed a BERT-based multi-task joint extraction method. This method divides the joint extraction of entity relations into multiple subtasks, improving extraction efficiency and solving the problem of relationship redundancy, but it still faces the challenge of single feature representation.

2.3. Artificial Intelligence and Hydraulic Engineering

In recent years, with the continuous advancement of artificial intelligence technology, an increasing number of researchers have turned their attention to knowledge extraction in the field of hydraulic engineering. Zhang et al. [28] proposed Multi-Channel, a method aimed at managing the substantial textual data generated during hydraulic engineering construction. This method specifically targets the identification of hidden issues in construction documents to ensure the quality of hydraulic engineering projects. Similarly, Liu et al. [29] utilized a BERT-BiLSTM-CRF model to extract entities such as risks and project details from hydraulic engineering datasets, constructing a specialized knowledge graph. Leveraging this graph, they implemented a knowledge inference model to develop emergency response plans.
This study proposes a method for joint extraction of entity relations based on multiple features, applied specifically to the domain of hydraulic engineering. This method improves the extraction of knowledge triplets from datasets concerning hydraulic engineering operation and management, thereby facilitating the construction of a hydraulic engineering knowledge graph.

3. Methods

The present model adopts a multitask framework comprising three subtasks: relation prediction, entity-relation alignment, and entity recognition. Its feature extraction module primarily includes the BERT pre-trained model, a BiGRU-GCM network layer, and a syntactic dependence-based GCN. The overall structure of the model is illustrated in Figure 1. Initially, BERT extracts sentence feature vectors and predicts potential relations within the sentence. Concurrently, these vectors undergo processing by BiGRU-GCM to enhance bidirectional contextual and global features of the sentence. The global features contribute to generating the entity-relation alignment matrix. Subsequently, the sentence’s dependency tree constructs the word adjacency matrix, enabling the GCN to learn inter-word dependency features. These features are integrated with contextual features to perform entity recognition based on relation prediction. Finally, the model combines predicted relations with extracted entity pairs and compares them with the entity-relation alignment matrix to derive correct triples.
This section provides a detailed introduction to the feature extraction modules used in Section 3.1, Section 3.2, Section 3.3 and Section 3.4. Section 3.5 introduces the sub-tasks and explains the features they utilize. Section 3.6 then introduces the model’s loss function.

3.1. The BERT Pre-Training Model

This study utilizes the pre-trained BERT model for text feature extraction. BERT employs a multi-layer bidirectional transformer as its encoder, with each layer containing 12 attention heads to consider contextual relationships through attention mechanisms and encode sentences. Pre-training of BERT involves two tasks: masked language modeling and next sentence prediction. In masked language modeling, BERT randomly masks some words and predicts them based on contextual information from surrounding words. Next sentence prediction determines whether a subsequent sentence follows the input sentence. Post pre-training, BERT demonstrates enhanced language understanding capabilities.
For a text sequence $T = \{t_1, t_2, t_3, \ldots, t_n\}$ of length $n$, the BERT model obtains its input vector by summing the word embedding $u_T$, the segment embedding $u_S$, and the position embedding $u_P$; the output sequence is then derived through the encoding layers. The formula for calculating the input vector of BERT [17] is shown in Equation (1).
$$U = u_T + u_S + u_P$$
where $U$ represents the input vector of the BERT model; $u_T$ denotes the word embedding, which has a dimensionality of 768; $u_S$ is the segment embedding, which distinguishes which sentence each token of the input sequence belongs to; and $u_P$ is the positional embedding, which provides positional information for each token.
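To make the encoding step concrete, the following sketch (our illustration, not the authors' released code) obtains character-level sentence features from a Chinese BERT encoder with the Hugging Face Transformers library. The checkpoint name "bert-base-chinese", the maximum length of 70, and the Chinese rendering of the example sentence are assumptions consistent with the setup described in this paper.

```python
# Illustrative sketch (not the authors' released code): encode a sentence with
# a Chinese BERT model and take the last hidden states as the features S.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # assumed checkpoint
bert = BertModel.from_pretrained("bert-base-chinese")

sentence = "桥墩和护坡坍塌"  # assumed Chinese form of "bridge pier and slope protection collapse"
inputs = tokenizer(sentence, return_tensors="pt",
                   max_length=70, padding="max_length", truncation=True)

with torch.no_grad():
    S = bert(**inputs).last_hidden_state  # one 768-dim vector per token position

print(S.shape)  # torch.Size([1, 70, 768])
```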

3.2. BiGRU-GCM

After utilizing the BERT model to obtain sentence features, we apply BiGRU-GCM to extract both local context and global features from the sentences.

3.2.1. BiGRU Model

GRU is a recurrent neural network that employs a gating mechanism to effectively mitigate issues such as exploding and vanishing gradients [30]. It retains sequential semantic information through update and reset gates. BiGRU integrates forward and backward GRU units to capture bidirectional contextual information. We feed the feature vectors produced by the BERT model into the BiGRU layer, which uses the forward encoder $\overrightarrow{GRU}$ and the backward encoder $\overleftarrow{GRU}$ to capture the contextual features of sentences. The contextual features are obtained with BiGRU as follows:
$$\overrightarrow{H_i} = \overrightarrow{GRU}(\overrightarrow{H}_{i-1}, s_i)$$
$$\overleftarrow{H_i} = \overleftarrow{GRU}(\overleftarrow{H}_{i+1}, s_i)$$
$$H_i = (\overrightarrow{H_i}; \overleftarrow{H_i})$$
Here, $s_i$ represents the $i$-th output of BERT, and $H_i$ denotes the bidirectional contextual feature vector of the $i$-th word.
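A minimal sketch of this step under stated assumptions: the hidden size of 384 per direction is our choice so that the concatenated forward and backward states match the 768-dimensional BERT features used elsewhere in the model; the paper does not specify the GRU hidden size.

```python
# Minimal sketch: a bidirectional GRU over the BERT features. hidden_size=384 per
# direction is an assumption so that (forward; backward) concatenation is 768-dim.
import torch
import torch.nn as nn

d = 768
bigru = nn.GRU(input_size=d, hidden_size=d // 2, batch_first=True, bidirectional=True)

S = torch.randn(1, 70, d)   # stand-in for the BERT outputs s_1 .. s_70
H, _ = bigru(S)             # H[:, i, :] concatenates forward H_i and backward H_i
print(H.shape)              # torch.Size([1, 70, 768])
```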

3.2.2. Global Context Mechanism

In the contextual features obtained by BiGRU, each word’s feature vector captures fragments of both forward and backward information from the sentence, which inherently limits the representational capacity of the sentence features. According to the calculation formula of BiGRU, the first feature vector output contains complete backward information of the sentence, while the last feature vector contains complete forward information. The GCM approach aims to augment these local context features with global features that encompass comprehensive information about the entire sentence. This enhancement boosts the overall representational capability of sentence features. The specific algorithmic process is outlined as follows:
(1)
$H_i$ is the output vector of the BiGRU model that contains the contextual information of the word $t_i$. $\overrightarrow{H_n}$ represents the forward global feature of the sentence and $\overleftarrow{H_1}$ represents its backward global feature. Therefore, $C = (\overrightarrow{H_n}; \overleftarrow{H_1})$ is the global feature of the entire sentence, incorporating bidirectional information.
(2)
By computing the weights, we integrate the global features of the sentence into the contextual features of each word. The formula for calculating these weights is as follows:
$$I_{H_i} = \sigma(W_H(C; H_i) + b_H)$$
$$I_{C_i} = \sigma(W_C(C; H_i) + b_C)$$
In these formulas, $I_{H_i}$ represents the contextual feature weight of the $i$-th character; $I_{C_i}$ denotes the global feature weight; $\sigma$ is the sigmoid activation function; $W_H \in \mathbb{R}^{2d \times d}$ and $W_C \in \mathbb{R}^{2d \times d}$ are trainable weight matrices; and $b_H$ and $b_C$ are the corresponding bias terms.
(3)
Finally, we multiply the calculated weights by their corresponding feature vectors and then sum them to obtain the output O of the GCM layer, where each word’s feature vector O i encapsulates the global features of the sentence. The output of the GCM layer is computed using the following formula:
$$O_i = H_i \cdot I_{H_i} + C \cdot I_{C_i}$$
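The following sketch is one possible reading of the GCM computation above, written as a PyTorch module. The way the forward and backward global states are sliced out of the BiGRU output, and the dimension choices, are assumptions on our part rather than the authors' released implementation.

```python
# Hedged reconstruction of the global context mechanism (GCM): build the global
# feature C from the last forward and first backward BiGRU states, then blend it
# into every position with sigmoid gates, as in the formulas above.
import torch
import torch.nn as nn

class GlobalContextMechanism(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.w_h = nn.Linear(2 * d, d)   # W_H and b_H
        self.w_c = nn.Linear(2 * d, d)   # W_C and b_C

    def forward(self, H):                # H: (batch, n, d), halves are fwd/bwd GRU states
        half = H.size(-1) // 2
        fwd_global = H[:, -1, :half]     # forward part of H_n: full forward information
        bwd_global = H[:, 0, half:]      # backward part of H_1: full backward information
        C = torch.cat([fwd_global, bwd_global], dim=-1)   # global feature C
        C = C.unsqueeze(1).expand(-1, H.size(1), -1)      # broadcast C to every position
        pair = torch.cat([C, H], dim=-1)                  # (C; H_i)
        i_h = torch.sigmoid(self.w_h(pair))               # contextual weight I_Hi
        i_c = torch.sigmoid(self.w_c(pair))               # global weight I_Ci
        return H * i_h + C * i_c                          # O_i

O = GlobalContextMechanism(768)(torch.randn(1, 70, 768))
print(O.shape)  # torch.Size([1, 70, 768])
```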

3.3. Syntactic Dependency Based GCN

To obtain inter-word dependency features from a sentence, we first utilize a tool to generate the syntactic dependency tree. Next, we convert this tree into a word adjacency matrix. Finally, we extract the dependency features from the adjacency matrix using GCN.

3.3.1. Syntactic Dependency

This study utilizes the spaCy tool to extract syntactic dependencies from each sentence and constructs a syntactic dependency tree using predefined rules. To effectively focus the model on word dependencies, we initially analyze individual word information and then connect each word with the initial character of its descendant words. This approach captures both intra-word information and inter-word dependencies. The process of constructing the word adjacency matrix is illustrated in Figure 2.
In Figure 2, black lines between characters represent internal connections within words consisting of multiple characters. Red lines denote connections between the initial characters of two words that have a dependency relationship. For example, ‘slope’ and ‘protection’ are constituent characters of the compound word ‘slope protection’; the black line between them represents the internal connection within ‘slope protection’. The red line between ‘slope’ and ‘collapsed’ indicates the syntactic dependency relationship between ‘slope protection’ and ‘collapsed’. In the word adjacency matrix, we assign a value of 1 to characters that are connected and 0 otherwise. The green area signifies internal connections within words, while the blue area represents dependencies between words.
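A hedged sketch of how such a character-level adjacency matrix could be built with spaCy. The pipeline name "zh_core_web_sm", the Chinese rendering of the example phrase, and the intra-word linking of consecutive characters are assumptions; the paper only states that spaCy and predefined rules are used.

```python
# Hedged sketch: build a character-level word adjacency matrix from spaCy
# dependencies. Characters inside one word are linked, and a head word and each
# of its dependent words are linked through their first characters.
import numpy as np
import spacy

nlp = spacy.load("zh_core_web_sm")   # assumed Chinese pipeline
sentence = "护坡坍塌"                 # assumed Chinese form of "slope protection collapsed"
doc = nlp(sentence)

n = len(sentence)
A = np.zeros((n, n), dtype=int)

for token in doc:
    start = token.idx                                  # character offset of the word
    for k in range(start, start + len(token.text) - 1):
        A[k, k + 1] = A[k + 1, k] = 1                  # intra-word connections
    if token.head is not token:                        # skip the root of the tree
        h = token.head.idx                             # first character of the head word
        A[h, start] = A[start, h] = 1                  # inter-word dependency connection

print(A)
```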

3.3.2. GCN

The word adjacency matrix converts inter-word dependency relationships into a graph structure, and GCNs are commonly employed for feature extraction from graph data. Therefore, this study utilizes GCN to extract inter-word dependency features from the word adjacency matrix. For a graph structure with nodes V and edges E , GCN aggregates the features of each node and its neighboring nodes based on the edges [31]. We perform graph convolution calculations on the feature vectors output by the BERT model using a syntactic dependency word adjacency matrix, allowing us to capture inter-word dependency features within sentences. The calculation formula for GCN is as follows:
$$G = \mathrm{ReLU}\left(\bar{A}(S W_G) + b_G\right)$$
where $\bar{A} = A + I$ is the adjacency matrix augmented with self-connections so that each token also retains its own information; $A$ is the dependency adjacency matrix; $I$ is the identity matrix; $W_G \in \mathbb{R}^{d \times d}$ is a trainable weight matrix; $b_G$ is the bias term; and $S$ is the sentence feature vector encoded by the BERT pre-trained model.
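A minimal sketch of this single GCN layer as written in the formula above (our reconstruction, not the authors' implementation); no degree normalization is applied because the formula does not include one.

```python
# Minimal sketch of the dependency GCN layer G = ReLU(A_bar (S W_G) + b_G),
# with A_bar = A + I adding self-connections.
import torch
import torch.nn as nn

class DependencyGCN(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.W_G = nn.Parameter(torch.empty(d, d))
        self.b_G = nn.Parameter(torch.zeros(d))
        nn.init.xavier_uniform_(self.W_G)

    def forward(self, S, A):                 # S: (batch, n, d), A: (batch, n, n)
        A_bar = A + torch.eye(A.size(-1), device=A.device)   # A_bar = A + I
        return torch.relu(torch.bmm(A_bar, S @ self.W_G) + self.b_G)

G = DependencyGCN(768)(torch.randn(1, 70, 768), torch.zeros(1, 70, 70))
print(G.shape)  # torch.Size([1, 70, 768])
```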

3.4. Feature Fusion

Finally, we fuse the obtained context features and inter-word dependency features, and dynamically adjust the fusion weights using gated units. The calculation formula is as follows:
$$g = \sigma(W_g G + b_g)$$
$$e = G \cdot g + H \cdot (1 - g)$$
where $g$ represents the fusion weight of the inter-word dependency features; $G$ denotes the inter-word dependency feature vector; $H$ denotes the contextual feature vector; $W_g \in \mathbb{R}^{d \times d}$ is a trainable parameter; and $b_g$ is the corresponding bias term. Through the dynamic weight $g$, the inter-word dependency and contextual features are integrated to obtain the joint feature $e$.
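A short sketch of the gated fusion, assuming elementwise gating between the dependency features $G$ and the contextual features $H$.

```python
# Sketch of the gated fusion e = G*g + H*(1-g), assuming elementwise gating.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.gate = nn.Linear(d, d)          # W_g and b_g

    def forward(self, G, H):                 # dependency features G, context features H
        g = torch.sigmoid(self.gate(G))      # fusion weight for the dependency features
        return G * g + H * (1 - g)           # joint feature e

e = GatedFusion(768)(torch.randn(1, 70, 768), torch.randn(1, 70, 768))
print(e.shape)  # torch.Size([1, 70, 768])
```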

3.5. Multi-Task Joint Extraction Framework

This study adopts a multi-task framework, dividing the joint extraction of entity relations into three subtasks: relation prediction, entity-relation correspondence, and entity recognition. Using this framework can effectively improve the efficiency of joint extraction and address the issue of relation redundancy.

3.5.1. Relationship Prediction

Unlike traditional joint extraction models that extract entities first and then predict the relationships between them, this study adopts a different approach. It predicts the relationships present in the sentence first and then performs entity recognition. This method aims to address the issue of relationship redundancy. Given a sentence feature vector S output by BERT, the subtask of relationship prediction can be achieved by applying average pooling and fully connected layers to determine the potential relationships that may exist in the sentence. The specific definition is as follows:
$$P_r = \sigma\left(W_r \, \mathrm{Avgpool}(S) + b_r\right)$$
where $\mathrm{Avgpool}$ denotes the average pooling operation; $P_r$ represents the probability that relation $r$ exists in the sentence; $W_r \in \mathbb{R}^{rel \times d}$ is a trainable weight parameter; and $b_r \in \mathbb{R}^{rel \times 1}$ is the bias term, where $rel$ is the total number of relation categories.
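An illustrative sketch of the relation prediction head. The number of relation types and the 0.5 decision threshold are assumptions made for the example.

```python
# Sketch of relation prediction: average-pool the BERT features over the sequence,
# score each relation type with a linear layer and sigmoid, then keep relations
# above a threshold. rel=10 and the 0.5 threshold are illustrative assumptions.
import torch
import torch.nn as nn

rel, d = 10, 768
rel_head = nn.Linear(d, rel)                    # W_r and b_r

S = torch.randn(1, 70, d)                       # BERT sentence features
P_r = torch.sigmoid(rel_head(S.mean(dim=1)))    # Avgpool over tokens, then sigmoid
candidate_relations = (P_r > 0.5).nonzero(as_tuple=True)[1]
print(P_r.shape, candidate_relations)
```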

3.5.2. Entity Relationship Correspondence

In entity relation correspondence tasks, we utilize the global feature vectors obtained from BiGRU-GCM to determine whether there exist head and tail entities corresponding to the same triplet in a sentence, thereby deriving the corresponding matrix M of entity relations. The definition rule for the corresponding matrix is:
$$M(i, j) = \begin{cases} 1, & i \rightarrow j \\ 0, & \text{otherwise} \end{cases}$$
where $i \rightarrow j$ indicates that words $i$ and $j$ mark a head-tail entity pair belonging to the same triplet: $i$ is the first word of the head entity and $j$ is the first word of the tail entity. Their correspondence is determined by computing the likelihood $P_{ij}$ of a relationship between the two words. We first set a threshold; when $P_{ij}$ exceeds this threshold, we judge that a relationship exists between the two words and set the value at the corresponding position of the matrix to 1. $P_{ij}$ is defined as follows:
$$P_{ij} = \sigma\left(W_m(O_i; O_j) + b_m\right)$$
where $O$ is the feature vector that includes the global features, and $W_m$ and $b_m$ are the parameters of the fully connected layer.
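A sketch of how the correspondence matrix $M$ could be computed, scoring every token pair from the concatenated GCM features; the 0.5 threshold is an assumption.

```python
# Sketch of the entity-relation correspondence matrix: every (i, j) token pair is
# scored from the GCM features (O_i; O_j) and thresholded to fill M.
import torch
import torch.nn as nn

n, d = 70, 768
corr = nn.Linear(2 * d, 1)                          # W_m and b_m

O = torch.randn(1, n, d)                            # GCM output with global features
O_i = O.unsqueeze(2).expand(-1, n, n, -1)           # candidate head-entity start i
O_j = O.unsqueeze(1).expand(-1, n, n, -1)           # candidate tail-entity start j
P = torch.sigmoid(corr(torch.cat([O_i, O_j], dim=-1))).squeeze(-1)   # P_ij
M = (P > 0.5).long()                                # alignment matrix M
print(M.shape)  # torch.Size([1, 70, 70])
```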

3.5.3. Entity Recognition

In the relation prediction task, once potential relationships within sentences are identified, we encode these relationships to derive their feature vectors. These relation feature vectors are then integrated with joint feature vectors that combine contextual and inter-word dependency features. Finally, by predicting the positions of the head and tail entities in the sentence, we determine the corresponding head and tail entities for each relationship. The specific calculation formula is as follows:
$$P_{ij}^{head} = \mathrm{softmax}\left(W_{head}(e_i + r_j) + b_{head}\right)$$
$$P_{ij}^{tail} = \mathrm{softmax}\left(W_{tail}(e_i + r_j) + b_{tail}\right)$$
where $e_i$ represents the $i$-th word vector fused with contextual features and dependency information; $r_j$ denotes the vector of the $j$-th relationship present in the sentence; $W_{head}$ and $W_{tail}$ are trainable weight matrices; $b_{head}$ and $b_{tail}$ are the corresponding bias terms; and $P_{ij}^{head}$ and $P_{ij}^{tail}$ denote the probabilities of the $i$-th word being tagged as the start, interior, or outside position of the head and tail entities, respectively.
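A sketch of the relation-specific entity tagging step, assuming the relation vectors $r_j$ come from a learned embedding table and that each tagger uses three labels (start, interior, outside).

```python
# Sketch of relation-specific entity tagging: a learned relation vector r_j is
# added to every fused token feature e_i, and two softmax heads tag head-entity
# and tail-entity positions as start / interior / outside.
import torch
import torch.nn as nn

n, d, rel = 70, 768, 10
rel_emb = nn.Embedding(rel, d)                  # relation vectors r_j (assumed learned)
head_tagger = nn.Linear(d, 3)                   # W_head, b_head
tail_tagger = nn.Linear(d, 3)                   # W_tail, b_tail

e = torch.randn(1, n, d)                        # fused context + dependency features
r_j = rel_emb(torch.tensor([2]))                # one predicted relation (index assumed)
x = e + r_j.unsqueeze(1)                        # e_i + r_j at every position i
P_head = torch.softmax(head_tagger(x), dim=-1)  # P_ij^head
P_tail = torch.softmax(tail_tagger(x), dim=-1)  # P_ij^tail
print(P_head.shape, P_tail.shape)
```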

3.6. Loss Function

This paper sets up loss functions separately for three subtasks. The specific definition is as follows:
$$L_r = -\frac{1}{n_r}\sum_{i=1}^{n_r}\left[y_i \log P_r + (1 - y_i)\log(1 - P_r)\right]$$
$$L_M = -\frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n}\left[y_{ij}\log P_{ij} + (1 - y_{ij})\log(1 - P_{ij})\right]$$
$$L_e = -\frac{1}{2 \times n \times m}\sum_{t \in \{head,\, tail\}}\sum_{j=1}^{m}\sum_{i=1}^{n} y_{ij}^{t}\log P_{ij}^{t}$$
In the equation, L r , L M , and L e represent the loss functions for relation prediction, entity-relation correspondence, and entity recognition, respectively; n r denotes the size of the complete set of relations; n is the length of the sentence; and m is the number of potential relations in the sentence. The total loss function of the model is:
$$L = L_r + L_M + L_e$$
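A compact sketch of the combined training objective, assuming binary cross-entropy for the relation and correspondence subtasks and token-level cross-entropy (averaged over the head and tail taggers) for entity recognition, matching the formulas above.

```python
# Sketch of the combined objective L = L_r + L_M + L_e under the assumptions above.
import torch.nn.functional as F

def total_loss(P_r, y_r, P_m, y_m, head_logits, y_head, tail_logits, y_tail):
    L_r = F.binary_cross_entropy(P_r, y_r)            # relation prediction loss
    L_M = F.binary_cross_entropy(P_m, y_m)            # global correspondence loss
    L_e = 0.5 * (F.cross_entropy(head_logits.transpose(1, 2), y_head) +
                 F.cross_entropy(tail_logits.transpose(1, 2), y_tail))
    return L_r + L_M + L_e
```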

4. Experiment

4.1. Datasets

This study utilized inspection records and risk prevention manuals from the management division of the South-to-North Water Diversion Project in China as experimental data. The dataset comprehensively documents the content, occurrence time, and location of risk events, as well as risk prevention and control measures issued by the management division. Constructing a knowledge graph in the field of hydraulic engineering using this Chinese dataset can assist engineering management in addressing daily risks in hydraulic projects and provide intelligent services for operational management. After annotating the data, we formed a hydraulic engineering dataset containing 22,828 triples. The dataset was divided into training, testing, and validation sets in a ratio of 7:2:1, and was further segmented based on the overlap types of entity relationships in sentences and the number of triples.
The statistics of the dataset are shown in Table 1, where “Normal”, “SEO”, and “EPO” represent the types of overlapping triples. “N = 1”, “N = 2”, “N = 3”, “N = 4”, and “N ≥ 5”, respectively, indicate the number of entity-relation triples in a single sentence, with 1, 2, 3, 4, and more than 5 triples. “Train”, “Test”, “Valid”, and “ALL” represent the number of sentences in the training set, test set, validation set, and the entire dataset for each of the above types.

4.2. Evaluation Metrics

To accurately measure the model’s performance, this paper employed common evaluation metrics: precision ( P ), recall ( R ), and F1 score. P indicates the proportion of correctly identified triples among all identified triples, while R represents the proportion of correctly identified triples among all actual correct triples. F1 score, a comprehensive metric balancing precision and recall, effectively reflects the knowledge extraction model’s performance. Higher values of these metrics indicate better model performance. Their calculation formulas are as follows [32]:
$$P = \frac{TP}{TP + FP}$$
$$R = \frac{TP}{TP + FN}$$
$$F1 = \frac{2 \times P \times R}{P + R}$$
where $TP$ represents extracted triples that are correct; $FP$ represents extracted triples that are incorrect; and $FN$ represents correct triples that were not recognized.
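A small sketch of triple-level scoring under these definitions: a predicted triple counts toward $TP$ only when its head entity, relation, and tail entity all match a gold triple exactly. The example triples are illustrative English renderings.

```python
# Sketch of triple-level precision, recall, and F1 over sets of (head, relation, tail).
def prf1(predicted, gold):
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)
    p = tp / len(predicted) if predicted else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

pred = [("slope protection", "risk event", "collapse")]
gold = [("slope protection", "risk event", "collapse"),
        ("bridge pier", "risk event", "collapse")]
print(prf1(pred, gold))  # (1.0, 0.5, 0.666...)
```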

4.3. Experiment Settings

The BERT pre-trained model used in this study consisted of 12 layers of transformer encoders with word embeddings of dimensionality 768, totaling 110 million parameters. During training, the model had a batch size of 8, and the maximum sequence length of inputs was set to 70. Additionally, we conducted separate training for the parameters of the BERT pre-trained model and other parameters. The learning rate for the BERT model parameters was 0.0001, while for the remaining model parameters, it was 0.001.
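A sketch of the two-learning-rate setup described above. Grouping parameters by whether their name starts with "bert" and the choice of AdamW are assumptions; the paper does not name the optimizer.

```python
# Sketch of the two-rate training setup: BERT parameters at 1e-4, all other
# parameters at 1e-3. The "bert" name prefix and AdamW are assumptions.
import torch

def build_optimizer(model):
    bert_params, other_params = [], []
    for name, param in model.named_parameters():
        (bert_params if name.startswith("bert") else other_params).append(param)
    return torch.optim.AdamW([
        {"params": bert_params, "lr": 1e-4},
        {"params": other_params, "lr": 1e-3},
    ])
```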

4.4. Experimental Result and Analysis

4.4.1. Comparative Experiments on Entity Relation Extraction

To validate the effectiveness of our approach, this study selected four widely used and representative benchmark models: (1) CopyRe [7], which is a joint relation extraction model based on a copying mechanism. It copies entities from the source sentence and predicts relations from a predefined set of relations; (2) CasRel [33], which proposes a cascaded binary labeling framework to address the issue of relation overlap. It models relations as functions that map the head entity to the tail entity in the sentence and uses a pointer network with multi-layered relation labels for decoding; (3) TPLinker [34], which introduces a novel handshake tagging scheme that transforms the task of jointly extracting entity relations into a token-pair linking problem; and (4) PRGC [27], which proposes a method that first predicts relations and then performs pruning operations using a global correspondence matrix. Furthermore, to ensure the fairness of the experiments, we used the same experimental settings for all of these models. The experimental results are shown in Table 2. Bold font in the table is used to highlight the best results.
As seen in Table 2, our model demonstrated significantly better performance compared to the baseline model. Specifically, our model’s precision, recall, and F1 score were improved by 6.79%, 0.96%, and 3.90%, respectively, compared to PRGC. This is because although PRGC adopts a multi-task framework, it only utilizes BERT embeddings for input text, resulting in a single feature representation that limits the model’s ability to comprehend semantic information effectively. Of note, PRGC achieved higher precision compared to TPLinker by comparing the entity-relation correspondence matrix with predicted entity-relation triplets. In contrast, our model achieved a more significant improvement in precision compared to PRGC, which we attribute to the introduction of global features, enabling more accurate prediction of the entity-relation corresponding matrix, thereby eliminating more erroneous triplets in the final comparison stage. Overall, our model’s joint entity-relation extraction on the hydraulic engineering dataset proved effective.

4.4.2. Comparative Experiments in Complex Scenarios

According to Table 1, it can be seen that the sentences in the hydraulic engineering dataset of this paper can be divided into three types based on the types of overlapping entity relationship triples they contain: normal, SEO, and EPO. Based on these three types of overlap, we divided the entire dataset into three datasets with overlapping entity relationships. Furthermore, based on the number of triples contained in a single sentence, we created five multi-triple datasets.
To evaluate our model’s extraction performance in complex semantic environments, comparative experiments were conducted on these datasets. Additionally, as depicted in Table 2, PRGC exhibited the best overall performance among models other than ours. Hence, PRGC was chosen as the benchmark model for this comparative experiment. The results of the relationship overlap comparative experiment are presented in Table 3.
From the experimental results in the table, it can be observed that our model outperformed the baseline model on the overlapping triples datasets. For the normal, SEO, and EPO datasets, our method achieved F1 scores that were, respectively, 0.57%, 1.06%, and 0.94% higher than those of PRGC. This demonstrates that our method was effective in identifying entities and relationships in sentences containing overlapping triples in the hydraulic engineering dataset.
Continuing the comparative experiments on datasets with multiple triplets further demonstrated the effectiveness of our method in complex scenarios. The experimental results in the table show that our model performed well regardless of the number of triplets contained in the dataset. It is evident from the data in the table that our method achieved significant improvements over the baseline model in datasets with N = 3, N = 4, and N ≥ 5, with F1 score increases of 3.60%, 1.86%, and 1.50%, respectively. However, in datasets with N = 1 and N = 2, our method performed similarly to PRGC. We believe that in simple sentences, the dependency between words is weaker, and the effectiveness of global features may be affected, resulting in less noticeable improvements in the model’s performance.
This experiment validates the effectiveness of multi-feature extraction methods in complex scenarios. PRGC utilizes a single-feature extraction method that focuses solely on the sentence’s contextual information, potentially limiting its ability to capture the intricate semantic details within sentences. In contrast, our approach employs a multi-feature representation method that supplements global features and inter-word dependency features. This enhancement enables our model to achieve superior performance when confronted with complex sentences in the hydraulic engineering dataset.

4.4.3. Ablation Study

To assess the effectiveness of each model component, we conducted ablation experiments by partitioning the model into five variants: 1. BERT: baseline model utilizing the BERT encoder; 2. BERT-BiGRU: removes the contextual mechanism and GCN, employing bidirectional contextual features for entity relation mapping and recognition; 3. BERT-BiGRU-GCM: excludes the GCN based on syntactic dependency; 4. BERT-BiGRU-GC-GCN-Add: integrates contextual and dependency features through addition; and 5. BERT-BiGRU-GC-GCN-Gate: final model proposed in this study. Figure 3 illustrates the results of these ablation experiments.
As shown in the experimental results, the addition of various modules on top of the BERT model improved the performance of the model. Specifically, when the model incorporated the BiGRU-GCM module, the F1 score was improved by 1.34% compared to the baseline model. This indicates that the inclusion of context features and global features enhanced the model’s understanding ability, thereby improving its performance. Furthermore, when the model introduced inter-word dependency features and adopted an additive fusion method with context features, the F1 score of the model only improved by 1.10% compared to the baseline model. However, when using gating units for fusion, the F1 score of the model improved by 3.90%, and all metrics achieved their highest values. Through the experiments in this section, it was demonstrated that the multi-feature representation of context features, global features, and inter-word dependency features, along with the fusion mechanism of gating units, can enable the model to achieve optimal performance in joint extraction of entity relations tasks.

4.4.4. Model Parameter Experiment

This experiment investigated the impact of varying input sequence lengths on model performance. We configured the model with different sequence lengths and compared their precision, recall, and F1 scores accordingly. The experimental results are depicted in Figure 4.
The results depicted in the figure show that our model produced varying outcomes across sequence lengths of 60, 70, 80, 90, and 100. Optimal performance was observed at a sequence length of 70, whereas both shorter and longer lengths resulted in decreased effectiveness. This can be attributed to the use of BiGRU-GCM for global feature extraction in our model. Excessively long input sequences may cause the model to overlook global features in simple sentences, while overly short sequences may lead to the loss of semantic information in longer sentences. The experiments in this section confirm that a sequence length of 70 resulted in optimal performance for our approach on the studied hydraulic engineering dataset.

5. Conclusions

This paper proposes a multi-feature method for joint entity and relation extraction tailored for hydraulic engineering. The approach enhances sentence contextual and global features using BiGRU-GCM and employs a syntax-dependency-based GCN to extract inter-word dependency features from sentences. Experimental results demonstrate that the proposed method achieves superior performance on a Chinese hydraulic engineering dataset. Compared to the PRGC model, it improves precision, recall, and F1 score by 6.79%, 0.96%, and 3.90%, respectively. These results indicate that the multi-feature representation method effectively mitigates accuracy decline caused by insufficient feature representation capability during joint entity and relation extraction from complex sentences in hydraulic engineering texts, thereby facilitating the construction of knowledge graphs in hydraulic engineering.
However, the method proposed in this paper is built upon the BERT pre-trained model. There have been many BERT-based variants developed, such as RoBERTa and ALBERT, which enhance the performance of BERT by making improvements to the pre-training model. In the future, we plan to explore the use of different pre-trained models to further enhance the performance of our model. Additionally, this study only investigates the impact of Chinese syntactic dependencies on the joint extraction task. Subsequently, we aim to research the application of this method in other languages.

Author Contributions

Conceptualization, Y.L.; methodology, Y.L.; software, X.W.; validation, Y.L. and X.W.; formal analysis, X.W.; investigation, X.W.; resources, Y.L.; data curation, X.L., Z.R. and Y.W.; writing—original draft preparation, Y.L.; writing—review and editing, Y.L.; visualization, X.W.; supervision, Y.L.; project administration, Y.L., Z.R. and Q.C.; funding acquisition, X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Education Department of Henan Province, grant number 24A520021. The APC was funded by the Education Department of Henan Province.

Data Availability Statement

The data that support the findings of this study can be accessed upon reasonable request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Yan, J.; Lv, T.; Yu, Y. Construction and recommendation of a water affair knowledge graph. Sustainability 2018, 10, 3429. [Google Scholar] [CrossRef]
  2. Tuo, M.; Yang, W. Review of entity relation extraction. J. Intell. Fuzzy Syst. 2023, 44, 7391–7405. [Google Scholar] [CrossRef]
  3. Wang, C.; Ma, X.; Chen, J.; Chen, J. Information extraction and knowledge graph construction from geoscience literature. Comput. Geosci. 2018, 112, 112–120. [Google Scholar] [CrossRef]
  4. Ji, S.; Pan, S.; Cambria, E.; Marttinen, P.; Yu, P.S. A Survey on Knowledge Graphs: Representation, Acquisition, and Applications. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 494–514. [Google Scholar] [CrossRef]
  5. Zhang, Q.; Chen, M.; Liu, L. A Review on Entity Relation Extraction. In Proceedings of the 2017 Second International Conference on Mechanical, Control and Computer Engineering (ICMCCE), Harbin, China, 8–10 December 2017; pp. 178–183. [Google Scholar]
  6. Wang, X.; Yang, R.; Feng, Y.; Li, D.; Hou, J. A military named entity relation extraction approach based on deep learning. In Proceedings of the 2018 International Conference on Algorithms, Computing and Artificial Intelligence, Sanya, China, 21–23 December 2018; pp. 1–6. [Google Scholar]
  7. Zeng, X.; Zeng, D.; He, S.; Liu, K.; Zhao, J. Extracting relational facts by an end-to-end neural model with copy mechanism. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, 15–20 July 2018; Volume 1: Long Papers, pp. 506–514. [Google Scholar]
  8. Li, D.; Yan, L.; Yang, J.; Ma, Z. Dependency syntax guided BERT-BiLSTM-GAM-CRF for Chinese NER. Expert Syst. Appl. 2022, 196, 116682. [Google Scholar] [CrossRef]
  9. Qin, Q.; Zhao, S.; Liu, C.J.C. A BERT-BiGRU-CRF Model for Entity Recognition of Chinese Electronic Medical Records. Complexity 2021, 2021, 6631837. [Google Scholar] [CrossRef]
  10. Xu, C.; Shen, K.; Sun, H. Supplementary features of BiLSTM for enhanced sequence labeling. arXiv 2023, arXiv:2305.19928. [Google Scholar]
  11. Aone, C.; Halverson, L.; Hampton, T.; Ramos-Santacruz, M. SRA: Description of the IE2 system used for MUC-7. In Proceedings of the Seventh Message Understanding Conference (MUC-7), Fairfax, VA, USA, 29 April–1 May 1998. [Google Scholar]
  12. Zhou, G.; Su, J. Named entity recognition using an HMM-based chunk tagger. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, 6–12 July 2002; pp. 473–480. [Google Scholar]
  13. Finkel, J.R.; Grenager, T.; Manning, C.D. Incorporating non-local information into information extraction systems by gibbs sampling. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05), Ann Arbor, MI, USA, 25–30 June 2005; pp. 363–370. [Google Scholar]
  14. Zeng, D.; Liu, K.; Lai, S.; Zhou, G.; Zhao, J. Relation classification via convolutional deep neural network. In Proceedings of the COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, Dublin, Ireland, 23–29 August 2014; pp. 2335–2344. [Google Scholar]
  15. Zheng, S.; Wang, F.; Bao, H.; Hao, Y.; Zhou, P.; Xu, B. Joint extraction of entities and relations based on a novel tagging scheme. arXiv 2017, arXiv:1706.05075. [Google Scholar]
  16. Wu, K.; Xu, L.; Li, X.; Zhang, Y.; Yue, Z.; Gao, Y.; Chen, Y. Named entity recognition of rice genes and phenotypes based on BiGRU neural networks. Comput. Biol. Chem. 2024, 108, 107977. [Google Scholar] [CrossRef]
  17. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K.J. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
  18. Sun, M.; Guo, Z.; Deng, X. Intelligent BERT-BiLSTM-CRF based legal case entity recognition method. In Proceedings of the ACM Turing Award Celebration Conference-China, Hefei, China, 30 July–1 August 2021; pp. 186–191. [Google Scholar]
  19. Yu, Y.; Wang, Y.; Mu, J.; Li, W.; Jiao, S.; Wang, Z.; Lv, P.; Zhu, Y. Chinese mineral named entity recognition based on BERT model. Expert Syst. Appl. 2022, 206, 117727. [Google Scholar] [CrossRef]
  20. Liu, Z.; Lin, W.; Shi, Y.; Zhao, J. A robustly optimized BERT pre-training approach with post-training. In Proceedings of the China National Conference on Chinese Computational Linguistics, Harbin, China, 3–5 August 2021; pp. 471–484. [Google Scholar]
  21. Lan, Z.; Chen, M.; Goodman, S.; Gimpel, K.; Sharma, P.; Soricut, R.J. Albert: A lite bert for self-supervised learning of language representations. arXiv 2019, arXiv:1909.11942. [Google Scholar]
  22. Yu, B.; Zhang, Z.; Shu, X.; Wang, Y.; Liu, T.; Wang, B.; Li, S. Joint extraction of entities and relations based on a novel decomposition strategy. arXiv 2019, arXiv:1909.04273. [Google Scholar]
  23. Yao, L.; Mao, C.; Luo, Y. Graph convolutional networks for text classification. In Proceedings of the AAAI conference on artificial intelligence, Honolulu, HI, USA, 27 January–1 February 2019; pp. 7370–7377. [Google Scholar]
  24. Zhang, S.; Tong, H.; Xu, J.; Maciejewski, R. Graph convolutional networks: A comprehensive review. Comput. Soc. Netw. 2019, 6, 11. [Google Scholar] [CrossRef] [PubMed]
  25. Lai, Q.; Zhou, Z.; Liu, S. Joint entity-relation extraction via improved graph attention networks. Symmetry 2020, 12, 1746. [Google Scholar] [CrossRef]
  26. Geng, Z.; Zhang, Y.; Han, Y. Joint entity and relation extraction model based on rich semantics. Neurocomputing 2021, 429, 132–140. [Google Scholar] [CrossRef]
  27. Zheng, H.; Wen, R.; Chen, X.; Yang, Y.; Zhang, Y.; Zhang, Z.; Zhang, N.; Qin, B.; Xu, M.; Zheng, Y. PRGC: Potential relation and global correspondence based joint relational triple extraction. arXiv 2021, arXiv:2106.09895. [Google Scholar]
  28. Zhang, D.; Li, M.; Tian, D.; Song, L.; Shen, Y. Intelligent text recognition based on multi-feature channels network for construction quality control. Adv. Eng. Inform. 2022, 53, 101669. [Google Scholar] [CrossRef]
  29. Liu, X.; Lu, H.; Li, H. Intelligent generation method of emergency plan for hydraulic engineering based on knowledge graph–take the South-to-North Water Diversion Project as an example. LHB 2022, 108, 2153629. [Google Scholar] [CrossRef]
  30. Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078. [Google Scholar]
  31. Wu, Z.; Pan, S.; Chen, F.; Long, G.; Zhang, C.; Philip, S. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 4–24. [Google Scholar] [CrossRef] [PubMed]
  32. Zheng, Z.; Liu, M.; Weng, Z. A Chinese BERT-Based Dual-Channel Named Entity Recognition Method for Solid Rocket Engines. Electronics 2023, 12, 752. [Google Scholar] [CrossRef]
  33. Wei, Z.; Su, J.; Wang, Y.; Tian, Y.; Chang, Y. A novel cascade binary tagging framework for relational triple extraction. arXiv 2019, arXiv:1909.03227. [Google Scholar]
  34. Wang, Y.; Yu, B.; Zhang, Y.; Liu, T.; Zhu, H.; Sun, L. TPLinker: Single-stage joint extraction of entities and relations through token pair linking. arXiv 2020, arXiv:2010.13415. [Google Scholar]
Figure 1. Overall architecture of the model.
Figure 2. Word adjacency matrix based on syntactic dependency.
Figure 3. Comparative experimental results of ablation analysis.
Figure 4. Comparative experimental results of parameter variations in sequence length.
Table 1. Dataset statistics.

Dataset   Normal   SEO    EPO   N = 1   N = 2   N = 3   N = 4   N ≥ 5
Train     2923     1772   330   2518    655     507     393     622
Test      835      506    95    719     187     145     112     178
Valid     418      253    47    360     94      72      56      89
ALL       4176     2531   472   3597    936     724     561     889
Table 2. Performance comparison of entity-relation extraction among different models. The model from the present study is compared with the baseline models using precision (P), recall (R), and F1-score.

Method     P       R       F1
CopyRe     0.417   0.386   0.401
CasRel     0.810   0.795   0.802
TPLinker   0.801   0.829   0.815
PRGC       0.810   0.833   0.821
Ours       0.865   0.841   0.853
Table 3. Performance comparison of models in complex environments.

Complex Scenarios   PRGC (P / R / F1)        Ours (P / R / F1)
Normal              0.897 / 0.859 / 0.877    0.906 / 0.859 / 0.882
SEO                 0.890 / 0.817 / 0.852    0.883 / 0.840 / 0.861
EPO                 0.866 / 0.839 / 0.853    0.867 / 0.855 / 0.861
N = 1               0.957 / 0.955 / 0.956    0.968 / 0.959 / 0.964
N = 2               0.928 / 0.926 / 0.927    0.944 / 0.909 / 0.926
N = 3               0.859 / 0.807 / 0.832    0.886 / 0.840 / 0.862
N = 4               0.841 / 0.775 / 0.806    0.841 / 0.802 / 0.821
N ≥ 5               0.834 / 0.770 / 0.800    0.865 / 0.758 / 0.812
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, Y.; Wang, X.; Liu, X.; Ren, Z.; Wang, Y.; Cai, Q. Joint Extraction Method for Hydraulic Engineering Entity Relations Based on Multi-Features. Electronics 2024, 13, 2979. https://doi.org/10.3390/electronics13152979
