1. Introduction
The development of automated vessels, artificial intelligence algorithms, and big data technologies is continuously driving the advancement of intelligent ships [1,2]. In the process of intelligent ship development, the safety and reliability of marine intelligent engine room systems directly affect the safety of ship navigation. Analyzing, mining, and preserving the information resources within the marine engine room, organizing knowledge resources in an orderly manner, and establishing a unified data source for the field of marine engine room systems are the foundations of complex knowledge accumulation. Text is one of the largest and most common data sources on ships, and documents such as engine logbooks and oil record books contain important information. Therefore, exploring flexible and scalable methods for accumulating knowledge in marine engine room systems is significant.
A knowledge graph is a graphical form of knowledge organization with good readability, scalability, and interpretability [3]. It can provide a decision-making basis for artificial intelligence. Entities are the most important component of a knowledge graph, so named entity recognition (NER) is of great significance for constructing knowledge graphs. NER technology can extract key information from text, such as domain-specific terms denoting times, equipment, and systems [4]. The quality of recognition also significantly affects downstream work, such as relation extraction and knowledge graph construction. NER technology can rely on general or domain-specific knowledge as prior knowledge. Although general domain knowledge covers more entities and has better universality, precision is emphasized more in domain-specific applications, such as entity recognition in the military domain, the medical domain, and network operations [5,6,7]. Currently, there is limited research on NER in the field of marine engine room systems, although a large amount of unstructured semantic information has been accumulated during the operation and maintenance of marine engine room equipment in actual ship operations. Therefore, entity recognition for marine engine rooms is not only conducive to the later establishment of marine engine room knowledge graphs and the mining of a large amount of implicit knowledge, but can also support decision making for intelligent engine room operation and maintenance tasks.
NER methods can be classified into three categories based on their development history: rule-based, machine learning-based, and deep learning-based [8]. Rule-based methods match named entities directly from sentences using dictionaries or rule templates specified by domain experts. Chiticariu, L. et al. (2010) [9] proposed an advanced rule language for building and customizing NER annotators, demonstrating their effectiveness across different domains. However, rule-based methods rely on expert-defined templates, which are challenging to create and may cover only some of the possible variations. Additionally, they suffer from high transfer costs and are limited to simple text data, which makes data with complex organization difficult to handle. Machine learning-based NER methods involve manually selecting features and then classifying them. These methods include hidden Markov models (HMM) [10], maximum entropy models (MEM) [11], support vector machines (SVM) [12], and others. The double-layer HMM model has performed well in Chinese term recognition and extraction. However, such methods rely heavily on text features and often lack generalizability.
On the other hand, deep learning-based named entity recognition can automatically learn hidden features from text information, eliminating the need for complex feature extraction processes. As a result, deep learning approaches have gained widespread attention in practical applications due to their ability to uncover meaningful patterns and their potential for better generalization. Collobert, R. et al. (2011) [13] introduced a word-level convolutional neural network (CNN) model that utilizes the output of the convolutional layer for prediction with a CRF layer. The model achieved a favorable F1 score of 89.59% on the English CoNLL2003 dataset. Huang, Z. et al. (2015) [14] incorporated manually designed spelling features into a BiLSTM-CRF model and achieved an F1 score of 88.83% on the CoNLL2003 dataset. Additionally, Wu, F. et al. (2019) [15] proposed a joint segmentation and CNN-BiLSTM-CRF model, enhancing the model’s ability to recognize the boundaries of Chinese named entities. The study also introduced a method for generating pseudo-labeled samples from existing annotated data, further improving entity recognition performance. Liu, W. et al. (2019) [16] proposed a WC-LSTM model that achieved an F1 score of 93.74% on the Microsoft Research (MSRA) public dataset. Adding word information to the beginning or end position of characters enhanced the semantic information. Similarly, using a fragment-based neural network structure, Wang, L. et al. (2018) [17] achieved automatic feature learning and obtained an F1 score of 90.44% on the MSRA dataset.
However, the abovementioned research methods cannot effectively capture the polysemy of words because they mainly focus on feature extraction of individual words, characters, or word-to-word relationships, neglecting contextual semantic information. Consequently, the extracted representations are static word vectors that do not include contextual information, leading to decreased entity recognition performance. In 2018, researchers at Google introduced a pre-training model called BERT, based on the existing attention mechanism [18]. BERT performs bidirectional encoding using transformers and has shown promising performance in named entity recognition tasks [19]. Currently, the BERT model has rarely been studied for NER in the marine engine room domain, for the following reasons:
(1) A single BERT model is insufficient for comprehensively capturing the semantic information of texts in the domain of marine engine rooms.
(2) The softmax layer in the BERT model is unable to effectively handle the requirements of the output sequence for entity labels in the marine engine room domain. As a result, the predicted output sequence may lack coherence and fail to align with the actual entity labeling requirements in real-world scenarios.
To solve these two issues, this study focuses on NER for marine engine room semantics. A modified model called BERT-BiLSTM-CRF is proposed, which combines BERT with a bidirectional long short-term memory network (BiLSTM) and a conditional random field (CRF) model. By leveraging the contextual semantic extraction and feature prediction capabilities of these components, the model tackles the problems of incomplete semantic information extraction and unrealistic output sequences, thereby improving the accuracy of named entity recognition. The experimental results on marine engine room texts demonstrate promising recognition performance, laying the foundation for establishing a marine engine room information knowledge graph.
The remaining parts of this paper are organized as follows:
Section 2 introduces the framework of the BERT-BiLSTM-CRF named entity recognition model.
Section 3 presents the semantic entity recognition design process in the marine engine room, including data annotation methods and the selection of evaluation metrics.
Section 4 presents the experimental results of the marine engine room semantic entity recognition case study and validates the effectiveness of the model improvements.
Section 5 concludes the paper and outlines future work plans and directions for semantic recognition in marine engine room systems.
2. The Method for Semantic Entity Recognition in Marine Engine Room Systems
The BERT-BiLSTM-CRF entity recognition model consists of three components: the BERT pre-trained word embedding model, the BiLSTM semantic extraction layer, and the CRF decoding layer. The marine engine room semantic entity extraction model based on the improved BERT-BiLSTM-CRF algorithm is illustrated in Figure 1.
2.1. BERT Model
BERT is a pre-trained language representation model in natural language processing. It can infer the relationships between words in a text and optimize the weights to extract feature information from the text. Compared with traditional pre-trained models, BERT is based on a self-attention mechanism that allows it to learn relationships between consecutive text segments and capture contextual information. The pre-training structure of the BERT model is illustrated in Figure 2.
As shown in Figure 2, the BERT model consists of several layers. The first layer is the input layer, representing the word embeddings ($E_1, E_2, \ldots, E_N$). The second and third layers are the encoding layers, where “Trm” denotes the transformer encoders that utilize the self-attention mechanism. The fourth layer represents the model’s output vectors ($T_1, T_2, \ldots, T_N$). Here, $N$ refers to the total number of input tokens or words.
Trm, a key component of the BERT model, has the encoding structure shown in Figure 3. To obtain word representations, Trm primarily optimizes the weight coefficient matrix based on the degree of correlation between words within the same sentence. The formula for the output of the self-attention mechanism is given by Equation (1).
In the equation, $Q$ represents the feature matrix of the sample, $K$ represents the feature matrix of the text information, $V$ represents the content matrix of the text information, and $d_k$ represents the dimension of the matrix $K$.
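Written out with the symbols just defined, Equation (1) corresponds to the widely used scaled dot-product attention. The form below is that standard expression, given as a reference under the assumption that the paper follows it without modification:

```latex
\begin{equation}
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
\tag{1}
\end{equation}
```

Dividing by $\sqrt{d_k}$ keeps the dot products from growing with the dimension of $K$, so the softmax does not saturate.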
The BERT model utilizes a multi-head attention mechanism based on self-attention. The number of heads corresponds to the number of self-attention mechanisms. In this mechanism, each head focuses on different contextual information for the same word. The calculation formula for the output matrix, MultiHead, is given by Equation (2).
In the equation, $h_i$ represents the output matrix of the $i$-th attention head. $W_i^Q$, $W_i^K$, and $W_i^V$ are the weight matrices for $Q$, $K$, and $V$, respectively. Concat denotes the concatenation of the $h_i$ matrices, which is then multiplied by the concatenation weight matrix.
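In the same notation, Equation (2) can be written in the standard multi-head form. The symbol $W^O$ for the matrix applied after concatenation and the head count $m$ are names assumed here for illustration; the text itself does not name them:

```latex
\begin{equation}
\begin{aligned}
\mathrm{MultiHead}(Q, K, V) &= \mathrm{Concat}(h_1, h_2, \ldots, h_m)\, W^{O}, \\
h_i &= \mathrm{Attention}\!\left(Q W_i^{Q},\; K W_i^{K},\; V W_i^{V}\right)
\end{aligned}
\tag{2}
\end{equation}
```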
Finally, the fourth layer of the BERT model outputs the word vector representation. In this way, the model obtains character vectors rich in semantic information from the marine engine room training text, enabling comprehensive storage of the text’s semantic information.
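The self-attention computation described in this subsection can be illustrated with a minimal single-head sketch in plain Python. It is not the BERT implementation: the function names and the toy matrices are assumptions chosen for illustration, and the learned projection matrices are omitted.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention for matrices given as lists of rows."""
    d_k = len(K[0])  # dimension of each key vector
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value rows.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out


# Toy input (hypothetical values): two tokens with two-dimensional features.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
context = attention(Q, K, V)
```

Each output row is a convex combination of the rows of `V`, weighted towards the value whose key best matches the query.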
2.2. BiLSTM Model
Long short-term memory (LSTM) is a type of recurrent neural network (RNN) that effectively addresses the issues of gradient explosion and vanishing gradients during training by introducing memory cells and gate mechanisms [20]. The core components of LSTM are the forget gate ($f$), the input gate ($i$), the output gate ($o$), and the memory cell ($c$). The forget gate is responsible for discarding irrelevant information from previous inputs, the input gate is used to retain relevant current information, the memory cell combines useful information from previous and current inputs, and the output gate outputs the final modified information. The structural expression of LSTM at time step $t$ is as follows:
At time $t$, $h_t$ represents the output of the LSTM, $x_t$ denotes the input, $w_f$ and $b_f$ are the weight matrix and bias of the forget gate, $w_i$ and $b_i$ are the weight matrix and bias of the input gate, $w_o$ and $b_o$ are the weight matrix and bias of the output gate, $\sigma$ represents the sigmoid function, and “*” denotes element-wise multiplication.
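With the symbols just listed, the standard LSTM update at time step $t$ can be written as below. This is the commonly used formulation, given on the assumption that the paper follows it. The candidate cell state $\tilde{c}_t$ and its parameters $w_c$ and $b_c$ are not named in the text and are introduced here for completeness:

```latex
\begin{equation}
\begin{aligned}
f_t &= \sigma\!\left(w_f \cdot [h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\!\left(w_i \cdot [h_{t-1}, x_t] + b_i\right) \\
\tilde{c}_t &= \tanh\!\left(w_c \cdot [h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t * c_{t-1} + i_t * \tilde{c}_t \\
o_t &= \sigma\!\left(w_o \cdot [h_{t-1}, x_t] + b_o\right) \\
h_t &= o_t * \tanh(c_t)
\end{aligned}
\tag{3}
\end{equation}
```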
Because a unidirectional LSTM model cannot process the preceding and following context at the same time, Graves, A. et al. proposed the BiLSTM neural network [21]. BiLSTM applies a forward and a backward LSTM to each word sequence and combines their outputs at the same time step, so that information is stored in both the forward and backward directions. The specific structure of BiLSTM is shown in Figure 4, and the output formula is given by Equation (4).
In the equation, $h_{tp}$ represents the stored forward information, and $h_{tb}$ represents the stored backward information.
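The forward and backward passes can be sketched with a scalar LSTM in plain Python. This is a toy illustration rather than the model used in the experiments. The parameter layout, the shared toy weights, and the choice to combine the two directions by pairing (concatenating) their outputs are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One scalar LSTM step; p maps a gate name to (w_h, w_x, b)."""
    def pre(name):
        w_h, w_x, b = p[name]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(pre("f"))        # forget gate
    i_t = sigmoid(pre("i"))        # input gate
    o_t = sigmoid(pre("o"))        # output gate
    c_tilde = math.tanh(pre("c"))  # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


def run_lstm(xs, p):
    """Run the LSTM over a sequence and return the output at each step."""
    h, c, outs = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        outs.append(h)
    return outs


def bilstm(xs, p_fwd, p_bwd):
    """Pair forward and backward outputs that belong to the same time step."""
    fwd = run_lstm(xs, p_fwd)
    bwd = run_lstm(xs[::-1], p_bwd)[::-1]  # re-align to the original order
    return list(zip(fwd, bwd))


# Hypothetical toy parameters, shared by all gates and both directions.
params = {gate: (0.5, 1.0, 0.0) for gate in "fioc"}
states = bilstm([1.0, -1.0, 0.5], params, params)
```

At the first position the forward state has seen only the first input, while the backward state has already seen the whole sequence, which is what gives each position access to context on both sides.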
2.3. CRF Model
In the task of NER, BiLSTM performs well in capturing long-distance contextual information but cannot capture dependencies between adjacent labels. Conditional random field (CRF), on the other hand, can leverage the relationships between neighboring labels to obtain an optimal predicted sequence. It helps address the limitations of BiLSTM and enables the segmentation and labeling of sequential data [22]. The scores for all possible labels of each word output by the BiLSTM model are used as the input score matrix for CRF, and the specific score transmission is illustrated in Figure 5.
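To make the adjacent-label dependency concrete, the sketch below checks whether a label sequence is well formed under the BIO scheme. Choosing the best label for each word independently can produce sequences that fail this check, whereas a CRF can learn transition scores that penalize them. The function name is hypothetical, and the rule encoded (an "I-" label must continue an entity of the same type) is the usual BIO convention, assumed here.

```python
def is_valid_bio(tags):
    """Return True if every I-X label follows a B-X or I-X of the same type."""
    prev = "O"
    for tag in tags:
        if tag.startswith("I-"):
            etype = tag[2:]
            if prev not in ("B-" + etype, "I-" + etype):
                return False
        prev = tag
    return True


well_formed = is_valid_bio(["B-SYS", "I-SYS", "O"])
starts_inside = is_valid_bio(["O", "I-SYS"])
type_switch = is_valid_bio(["B-EQU", "I-SYS"])
```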
For any input sequence $X = (x_1, x_2, \ldots, x_n)$, let $W$ be the scoring matrix output by the BiLSTM model, with size $n \times k$, where $n$ is the number of words and $k$ is the number of labels. Each element $W_{ij}$ is the score of the $j$-th label for the $i$-th word. For a predicted sequence $Y = (y_1, y_2, \ldots, y_n)$, the scoring function $s(X, Y)$ is calculated using Formula (5):
where $A$ is the transition score matrix and $A_{ij}$ is the transition score from label $i$ to label $j$. $A$ is a square matrix of order $k + 2$. The probability $p(Y \mid X)$ of the predicted sequence $Y$ given the input sequence $X$ is calculated using Formula (6).
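In this notation, Formulas (5) and (6) take the usual BiLSTM-CRF form shown below. It is assumed here that $y_0$ and $y_{n+1}$ denote added start and end labels, which would account for the order $k + 2$ of $A$, and that $Y_X$ denotes the set of all possible label sequences for $X$:

```latex
\begin{equation}
s(X, Y) = \sum_{i=0}^{n} A_{y_i,\, y_{i+1}} + \sum_{i=1}^{n} W_{i,\, y_i}
\tag{5}
\end{equation}

\begin{equation}
p(Y \mid X) = \frac{\exp\big(s(X, Y)\big)}{\sum_{\tilde{Y} \in Y_X} \exp\big(s(X, \tilde{Y})\big)}
\tag{6}
\end{equation}
```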
After taking the logarithm of both sides of the equation, the log-likelihood function of the predicted sequence is obtained.
In the equation, $Y$ represents the correct label sequence, $Y_X$ represents the set of all possible label sequences, and decoding refers to obtaining the output sequence $Y^*$ with the maximum score.
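Decoding the maximum-score sequence is usually done with the Viterbi algorithm. The following is a minimal sketch in plain Python, not the implementation used in this study. It assumes that the two extra rows and columns of the transition matrix hold the start and end labels at indices `k` and `k + 1`, and the toy scores are hypothetical.

```python
def sequence_score(W, A, y):
    """Score of label path y: transition scores plus emission scores."""
    k = len(W[0])
    start, end = k, k + 1  # assumed positions of the start and end labels in A
    path = [start] + list(y) + [end]
    trans = sum(A[a][b] for a, b in zip(path, path[1:]))
    emit = sum(W[i][t] for i, t in enumerate(y))
    return trans + emit


def viterbi(W, A):
    """Return the highest-scoring label path and its score."""
    n, k = len(W), len(W[0])
    start, end = k, k + 1
    best = [A[start][j] + W[0][j] for j in range(k)]
    back = []
    for i in range(1, n):
        ptr, nxt = [], []
        for j in range(k):
            # Best previous label for arriving at label j on word i.
            prev = max(range(k), key=lambda p: best[p] + A[p][j])
            ptr.append(prev)
            nxt.append(best[prev] + A[prev][j] + W[i][j])
        back.append(ptr)
        best = nxt
    last = max(range(k), key=lambda j: best[j] + A[j][end])
    score = best[last] + A[last][end]
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score


# Toy example with labels O=0, B=1, I=2 (hypothetical scores).
W = [[0.1, 0.5, 0.6], [0.2, 0.1, 0.9], [1.0, 0.0, 0.3]]
A = [[0.0] * 5 for _ in range(5)]
A[0][2] = -10.0  # discourage O -> I
A[3][2] = -10.0  # discourage start -> I
path, score = viterbi(W, A)
```

In this toy case, picking the highest-scoring label for each word on its own would give I, I, O, which begins inside an entity. With the transition penalties, decoding returns B, I, O instead.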
3. Design of Engine Room Semantic Entity Recognition
This section describes the workflow for the engine room semantic entity recognition task. The recognition process is divided into two main parts: data preprocessing and experimental analysis under different models. This study established a Chinese semantic dataset in the domain of marine engine room systems, which is more specialized and relevant than other public datasets. Therefore, this dataset is valuable for research on marine engine room semantic named entity recognition. The dataset was annotated using the BIO labeling scheme, and the models were evaluated using the Precision (P), Recall (R), and F1-score (F1) metrics.
3.1. The NER Process
Based on the above, the workflow for named entity recognition tasks on marine engine room semantics can be summarized as shown in Figure 6. The specific steps of the entity recognition task are as follows:
Step 1: Construct the marine engine room semantic dataset.
Step 2: Use appropriate text annotation software to annotate the text sets according to the BIO labeling scheme. Divide the dataset into training, validation, and testing sets according to a suitable ratio.
Step 3: Determine the model’s different parameter values and conduct multiple training rounds until the training requirements are met. Save the best-trained model.
Step 4: Use the saved best model to test the testing set, obtain the predicted label sequence, and complete the named entity recognition task on the testing texts.
This workflow outlines the key steps for training and evaluating a named entity recognition model on marine engine room training texts. It emphasizes data preparation, annotation, model training, and evaluation, ensuring the model is trained and tested on appropriate datasets to achieve accurate entity recognition results.
3.2. Dataset Annotation and Evaluation Indicators
There are two commonly used named entity labeling schemes: BIO and BIEO. However, the performance differences between these two labeling schemes in named entity recognition are relatively small. Therefore, this study chooses the BIO scheme to annotate the engine room training text. In the BIO scheme, “B” represents the beginning of an entity, followed by the specific entity type, such as “B-SYS”; “I” represents the inside of an entity, followed by the specific entity type, such as “I-SYS”; and “O” indicates that the word is not part of an entity.
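The BIO scheme described above can be illustrated with a short Python sketch that turns entity spans into per-character labels. The sentence, the span offsets, and the character-level granularity are assumptions made for illustration and are not taken from the annotated dataset.

```python
def bio_tags(text, spans):
    """Label each character; spans are (start, end, type) with end exclusive."""
    tags = ["O"] * len(text)
    for start, end, etype in spans:
        tags[start] = "B-" + etype       # first character of the entity
        for i in range(start + 1, end):
            tags[i] = "I-" + etype       # remaining characters of the entity
    return tags


# Hypothetical sentence: "起动主海水泵" (start the main seawater pump).
sentence = "起动主海水泵"
spans = [(0, 2, "ACT"), (2, 6, "EQU")]
tags = bio_tags(sentence, spans)
```

Here “起动” is labeled as an action (ACT) and “主海水泵” as equipment (EQU), so no character in this short example receives the “O” label.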
Popular text annotation tools include BRAT, Doccano, YEDDA, and Chinese Annotator. YEDDA is an open-source text annotation tool that overcomes the inefficiency of traditional annotation tools. It annotates entities through the command line and shortcut keys and supports custom entity labels. Therefore, in the study of engine room semantic entity recognition, the YEDDA annotation software was used to annotate the semantic entities in the dataset. As shown in Figure 7, entities such as “海水系统” (seawater system), “主海水泵” (main seawater pump), and “起动” (start) were annotated in the sentences. The annotated text information was exported and stored in the “.anns” file format. The annotation results are shown in Table 1.
The entity types annotated in the marine engine room training text corpus include three categories: SYS (system), EQU (equipment), and ACT (action). The entire dataset was divided into training, validation, and testing sets in an 8:1:1 ratio. Table 2 displays the distribution of the three entity types in the training, validation, and testing sets. Specific examples of each entity type in the dataset are presented in Table 3.
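An 8:1:1 division can be sketched in Python as follows. The shuffling, the fixed seed, and the function name are assumptions for illustration; the text does not state how the samples were assigned to each set.

```python
import random


def split_dataset(samples, ratios=(8, 1, 1), seed=42):
    """Shuffle the samples and split them into train, validation, and test sets."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # the remainder goes to the test set
    return train, val, test


train, val, test = split_dataset(range(100))
```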
This paper adopts the NER evaluation metrics proposed in the MUC-2 conference [23], which include Precision (P), Recall (R), and F1 score (F1) as the main evaluation metrics. A higher F1 score indicates better recognition performance of the experimental method.
In the formulas, NumT represents the number of correctly predicted entities, NumP represents the number of entities in the predicted results, and NumR represents the number of entities in the original annotations of the dataset.
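With these counts, the three metrics follow the standard definitions P = NumT / NumP, R = NumT / NumR, and F1 = 2PR / (P + R). The sketch below computes them in Python. The function names and the example counts are hypothetical, and the last line only checks that the overall precision and recall reported in this paper are consistent with the reported F1 score.

```python
def precision_recall_f1(num_t, num_p, num_r):
    """Entity-level metrics from the counts NumT, NumP, and NumR."""
    p = num_t / num_p if num_p else 0.0
    r = num_t / num_r if num_r else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f1


def f1_from(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical counts: 8 correct out of 10 predicted, 16 annotated.
p, r, f1 = precision_recall_f1(8, 10, 16)

# Consistency check on the reported overall P = 92.70% and R = 90.69%.
overall_f1 = round(f1_from(92.70, 90.69), 2)
```

The check reproduces the reported overall F1 score of 91.68%.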
4. Design of Engine Room Semantic Entity Recognition
This section applies the workflow designed for the engine room semantic entity recognition task. The recognition process is divided into two main parts: data preprocessing and experiment.
4.1. Experimental Environment and Parameter Configuration
The experiment used the PyTorch deep learning framework to build the training model for the engine room semantic entity recognition task. Table 4 displays the detailed configuration of the training environment.
To obtain the best performance from the proposed model, the hyperparameter values were determined experimentally. A comparative analysis was conducted in which only one hyperparameter was changed at a time; the resulting performance is shown in Figure 8.
According to the trends of the curves in Figure 8, the F1 score fluctuates as the hyperparameters change. Therefore, in the following experiments, each hyperparameter is set to the value at which the F1 score reaches its maximum.
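This one-at-a-time selection can be sketched in Python as follows. The hyperparameter names, the candidate values, and the scoring function are hypothetical stand-ins, since a real run would train the model and measure F1 for every trial. Keeping the other hyperparameters at their default values during each sweep is also an assumption.

```python
def select_hyperparameters(defaults, candidates, evaluate):
    """Vary one hyperparameter at a time and keep the value with the highest score."""
    selected = dict(defaults)
    for name, values in candidates.items():
        # Only `name` changes; every other hyperparameter stays at its default.
        scores = {v: evaluate({**defaults, name: v}) for v in values}
        selected[name] = max(scores, key=scores.get)
    return selected


def toy_f1(cfg):
    """Stand-in for training and evaluating the model (hypothetical)."""
    return 0.9 - abs(cfg["dropout"] - 0.5) - abs(cfg["batch_size"] - 32) / 1000


defaults = {"dropout": 0.1, "batch_size": 16}
candidates = {"dropout": [0.1, 0.3, 0.5, 0.7], "batch_size": [16, 32, 64]}
chosen = select_hyperparameters(defaults, candidates, toy_f1)
```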
This study used a pre-trained BERT model to represent the input text with vectors in the experiment. The BERT model has 12 layers, 12 attention heads, and outputs vectors of size 768. It has a total of 110 million parameters. The specific parameter settings for the BERT-BiLSTM-CRF model are shown in Table 5.
4.2. Experimental Result
The changes in the P, R, and F1 values of the BERT-BiLSTM-CRF model with the number of training epochs are shown in Figure 9.
Figure 9a shows that the performance metrics improve significantly during the first 15 epochs of training. This indicates that the named entity recognition model in this study learns quickly and performs well on the marine engine room training text data. After around 35 epochs, the metrics stabilize, and at the 42nd epoch, the F1 value reaches its peak of 91.68%. After 50 epochs of training, the parameters stabilize and show minimal changes, indicating that the model exhibits good stability.
Comparing (a) and (b) in Figure 9, it can be seen that the test set curve fluctuates more than the training set curve and that its optimal F1 score is lower.
For the training epoch at which the neural network model achieves its maximum F1 score, the precision, recall, and F1 score for recognizing the three types of entities in the marine engine room training text are shown in Table 6.
From Table 6, it can be observed that the BERT-BiLSTM-CRF model can accomplish the task of named entity recognition in marine engine room training text, with an overall F1 score of 91.68%. The overall classification performance is good. Among the three entity types, the F1 scores for EQU and ACT are higher than that for SYS. Analyzing the dataset reveals that EQU and ACT entities have fixed expressions and occur frequently, while SYS entities are fewer in number and have more complex semantic expressions. For example, in the training process, the SYS entity “货油泵透平系统 (Cargo oil pump turbine system)” may be misidentified as “货油泵 (Cargo oil pump)” and “透平系统 (Turbine system)” due to interference from the EQU entity “cargo oil pump.” Therefore, compared with the other two entity types, the performance on SYS is slightly lower.
4.3. Validation of Model Improvement Effectiveness
To validate the effectiveness of the model proposed in this study, the BERT model was first compared with the improved models, focusing on the extraction of contextual semantic information, feature prediction, and the fusion of the two.
(1) Validation of contextual semantic information extraction
An experiment was conducted to validate the effectiveness of using BiLSTM as the improvement method, and the results are shown in Table 7. The comparison of results is illustrated in Figure 10.
Compared with the BERT model, the BERT-BiLSTM model showed improvements in the accuracy of all three entity types, with an overall accuracy increase of 0.17%. Specifically, the recognition performance for SYS increased by 0.32%, EQU increased by 0.07%, and ACT increased by 0.12%. These results indicate that incorporating the BiLSTM module to optimize the BERT model further enhances the extraction of contextual features from the text, confirming the effectiveness of extracting contextual semantic information.
(2) Validation of feature prediction
An experiment was conducted to validate the effectiveness of using CRF as the improvement method, and the results are shown in Table 8. The comparison of results is illustrated in Figure 11.
Compared with the BERT model, the BERT-CRF model showed an overall accuracy improvement of 0.38%. Specifically, the recognition performance for SYS increased by 0.46%, EQU by 0.37%, and ACT by 0.32%. These results indicate that using CRF for predicting the entity recognition results in marine engine room training texts, as opposed to the softmax layer of the BERT model, enables capturing more comprehensive global information. This confirms the effectiveness of feature prediction.
On the other hand, when optimizing the model, the dataset’s complexity also affects the model’s performance. Reducing the complexity of the dataset may also improve the model’s accuracy [24,25].
(3) Ablation Experiments and Analysis
To investigate the contributions of each component in the proposed method, ablation experiments were conducted on the marine engine room semantic dataset. The comparative results are shown in Table 9.
Table 9 shows that the BERT-BiLSTM model and the BERT-CRF model, both built on the BERT algorithm, improved the F1 score of marine engine room named entity recognition by 0.17% and 0.38%, respectively. This indicates that both improvement approaches enhance the performance of named entity recognition, with the latter giving the larger improvement. When the BERT-BiLSTM-CRF model, which combines both approaches, is compared with the basic BERT model, the F1 score increases by 1.38%. Therefore, the combined model performs better in marine engine room named entity recognition than the individual models.
(4) Comparative Analysis of the Improved Models
The proposed model was compared with five mainstream named entity recognition algorithms: the machine learning algorithms HMM and CRF, the deep learning algorithms BiLSTM and BERT, and BiLSTM-CRF, which combines machine learning and deep learning. The comparison results of the various algorithms are shown in Figure 12.
Figure 12 shows that, in the entity recognition of marine engine room semantics, the three evaluation metrics of the machine learning algorithms are generally lower than those of the deep learning algorithms. The performance of the fused machine learning and deep learning algorithm, BiLSTM-CRF, is slightly inferior to that of the BERT model. The experimental results are shown in Table 10. According to Table 10, all algorithms recognize EQU and ACT entities better than SYS entities. This is because these two types of entities have clear semantic information, and their recognition does not rely heavily on a large amount of contextual semantic information. Compared with the other algorithms, the BERT-BiLSTM-CRF model constructed in this paper performs best in recognizing all types of entities in the marine engine room semantic dataset, with F1 scores of over 90% in all categories.
5. Conclusions
This paper addresses the problem of automatic extraction of named entities from marine engine room training texts in the automated construction phase of the marine engine room knowledge graph. It proposes a named entity recognition method based on BERT-BiLSTM-CRF. By leveraging the BERT pre-training model, the method tackles the issue of polysemy in text feature representation. It combines the advantages of the BiLSTM deep learning method in capturing contextual information and the CRF machine learning method in extracting globally optimal labeling sequences, thus obtaining marine engine room training entities. Experimental results demonstrate that this method outperforms common baseline algorithms in named entity recognition. It achieves Precision (P), Recall (R), and F1-score (F1) of 92.70%, 90.69%, and 91.68%, respectively, for the recognition of the three entity categories. The method completes the task of named entity recognition in marine engine room training texts and provides essential technical support for constructing the marine engine room knowledge graph.
In future work, the model will be applied to the entire engine room domain and the entity types will be further refined. For example, the “Equipment (EQU)” category will be divided into subcategories such as “Main Engine Equipment (MEQU)” and “Auxiliary Engine Equipment (AEQU).” Additionally, the model will continue to be improved by incorporating domain-specific dictionary features, enhancing word analysis techniques, and expanding the marine engine room corpus. These efforts will help achieve more accurate and comprehensive entity recognition in the marine engine room domain. At the same time, this study relies mainly on a model pre-trained on large-scale sample data, while the amount of text data available for the engine room is still small. Later research can therefore address named entity recognition on small sample sets, providing basic support for more efficient entity relationship extraction and construction of a marine engine room knowledge graph.