Article

SSuieBERT: Domain Adaptation Model for Chinese Space Science Text Mining and Information Extraction

1 Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094, China
2 Key Laboratory of Space Utilization, Chinese Academy of Sciences, Beijing 100094, China
3 University of Chinese Academy of Sciences, Beijing 100094, China
* Author to whom correspondence should be addressed.
Electronics 2024, 13(15), 2949; https://doi.org/10.3390/electronics13152949
Submission received: 13 June 2024 / Revised: 13 July 2024 / Accepted: 23 July 2024 / Published: 26 July 2024
(This article belongs to the Section Artificial Intelligence)

Abstract

With the continuous exploration of space science, a large volume of domain-related materials and scientific literature is constantly being generated, mostly in textual form, and it contains rich, largely unexplored domain knowledge. Natural language processing has developed rapidly, and pre-trained language models provide promising tools for information extraction. However, space science is a highly specialized field with many domain concepts and technical terms, and Chinese texts have complex language structures and word combinations, so general pre-trained models such as BERT may yield suboptimal performance. In this work, we investigate how to adapt BERT to Chinese space science and propose a space science-aware pre-trained language model, SSuieBERT. We validate it on downstream tasks such as named entity recognition, relation extraction, and event extraction, on which it performs better than general models. To the best of our knowledge, SSuieBERT is the first pre-trained language model for space science, and it can promote information extraction and knowledge discovery from space science texts.
Keywords: Chinese space science; pre-trained language model; domain adaptation; natural language processing
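
As a rough illustration of the domain adaptation described in the abstract, the sketch below continues masked-language-model pre-training of a general Chinese BERT checkpoint on a domain corpus using the Hugging Face transformers library. The base checkpoint name, corpus file, and hyperparameters are illustrative assumptions, not the authors' actual configuration or training setup.

# A minimal sketch of domain-adaptive pre-training (continued masked-language-model
# training) of a general Chinese BERT checkpoint with Hugging Face transformers.
# The checkpoint name, corpus path, and hyperparameters are assumptions for
# illustration only, not the configuration used in the paper.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_checkpoint = "bert-base-chinese"  # generic Chinese BERT as the starting point
tokenizer = AutoTokenizer.from_pretrained(base_checkpoint)
model = AutoModelForMaskedLM.from_pretrained(base_checkpoint)

# Hypothetical plain-text corpus of Chinese space science documents, one passage per line.
corpus = load_dataset("text", data_files={"train": "space_science_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Randomly mask 15% of tokens for the masked-language-model objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="ssuiebert-mlm",
        per_device_train_batch_size=16,
        num_train_epochs=3,
        learning_rate=5e-5,
    ),
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
# The adapted encoder can then be fine-tuned with task-specific heads for named
# entity recognition, relation extraction, or event extraction.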

Share and Cite

MDPI and ACS Style

Liu, Y.; Li, S.; Deng, Y.; Hao, S.; Wang, L. SSuieBERT: Domain Adaptation Model for Chinese Space Science Text Mining and Information Extraction. Electronics 2024, 13, 2949. https://doi.org/10.3390/electronics13152949

AMA Style

Liu Y, Li S, Deng Y, Hao S, Wang L. SSuieBERT: Domain Adaptation Model for Chinese Space Science Text Mining and Information Extraction. Electronics. 2024; 13(15):2949. https://doi.org/10.3390/electronics13152949

Chicago/Turabian Style

Liu, Yunfei, Shengyang Li, Yunziwei Deng, Shiyi Hao, and Linjie Wang. 2024. "SSuieBERT: Domain Adaptation Model for Chinese Space Science Text Mining and Information Extraction" Electronics 13, no. 15: 2949. https://doi.org/10.3390/electronics13152949

APA Style

Liu, Y., Li, S., Deng, Y., Hao, S., & Wang, L. (2024). SSuieBERT: Domain Adaptation Model for Chinese Space Science Text Mining and Information Extraction. Electronics, 13(15), 2949. https://doi.org/10.3390/electronics13152949

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
