Article

AB-LaBSE: Uyghur Sentiment Analysis via the Pre-Training Model with BiLSTM

1 Xinjiang Multilingual Information Technology Laboratory, Xinjiang Multilingual Information Technology Research Center, School of Software, Xinjiang University, Urumqi 832001, China
2 College of Information Science and Engineering, Xinjiang University, Urumqi 832001, China
3 National Engineering Research Center for Public Safety Risk Perception and Control by Big Data (RPP), China Academy of Electronics and Information Technology, Beijing 100041, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Appl. Sci. 2022, 12(3), 1182; https://doi.org/10.3390/app12031182
Submission received: 23 November 2021 / Revised: 8 January 2022 / Accepted: 20 January 2022 / Published: 24 January 2022
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

Text sentiment analysis has attracted increasing attention in recent years and has become a research hotspot in information extraction, data mining, Natural Language Processing (NLP), and other fields. As Internet use continues to spread, sentiment analysis of Uyghur texts has great research and application value for online public-opinion analysis. Most state-of-the-art systems require tens of thousands of annotated sentences to achieve high performance, yet very little annotated data is available for Uyghur sentiment analysis tasks. Each task also has its own specificities: differences in vocabulary and word order across languages make this a challenging problem. In this paper, we present an effective solution that provides a meaningful and easy-to-use feature extractor for sentiment analysis tasks: a pre-trained language model combined with a BiLSTM layer. First, the training data are augmented with AEDA (An Easier Data Augmentation) to improve the performance of the text classification task. Next, the pre-trained LaBSE model encodes the input data, and a BiLSTM layer then learns additional context information. Finally, the validity of the model is verified on a two-class sentiment analysis dataset and a five-class emotion analysis dataset. On both datasets, our approach shows strong performance compared with several strong baselines. We close with an overview of the resources available for sentiment analysis tasks and some of the open research questions. In summary, we propose a model that combines deep learning with cross-lingual pre-training for low-resource sentiment analysis.
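The abstract describes a concrete pipeline: AEDA punctuation-insertion augmentation, LaBSE token encoding, a BiLSTM layer for context, and a final classifier. The sketch below shows how such a pipeline could be wired together in PyTorch with the Hugging Face sentence-transformers/LaBSE checkpoint. It is a minimal illustration, not the authors' implementation: the class name ABLaBSE, the LSTM hidden size, the insertion ratio, and the mean-pooling step are our assumptions for demonstration purposes.

```python
import random

import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

# Punctuation marks inserted by AEDA (Karimi et al., 2021).
AEDA_PUNCS = [".", ";", "?", ":", "!", ","]


def aeda(sentence: str, max_ratio: float = 1 / 3) -> str:
    """Randomly insert punctuation marks between the words of a sentence."""
    words = sentence.split()
    n_insert = random.randint(1, max(1, int(max_ratio * len(words))))
    for _ in range(n_insert):
        words.insert(random.randint(0, len(words)), random.choice(AEDA_PUNCS))
    return " ".join(words)


class ABLaBSE(nn.Module):
    """LaBSE token encodings -> BiLSTM -> linear classifier (illustrative)."""

    def __init__(self, num_classes: int, lstm_hidden: int = 256):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("sentence-transformers/LaBSE")
        self.bilstm = nn.LSTM(self.encoder.config.hidden_size, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, input_ids, attention_mask):
        # Token-level contextual embeddings from the pre-trained encoder.
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        lstm_out, _ = self.bilstm(hidden)
        # Mean-pool the BiLSTM outputs over non-padding tokens.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (lstm_out * mask).sum(dim=1) / mask.sum(dim=1)
        return self.classifier(pooled)


# Usage: augment a sentence, encode it, and score its polarity.
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/LaBSE")
model = ABLaBSE(num_classes=2)
batch = tokenizer([aeda("this film was surprisingly good")],
                  return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    logits = model(batch["input_ids"], batch["attention_mask"])
```

Because LaBSE is trained on sentence embeddings for 100+ languages, the same frozen or fine-tuned encoder can serve low-resource languages such as Uyghur, with the BiLSTM supplying task-specific sequence modeling on top.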
Keywords: sentiment analysis; cross-lingual pre-trained language model; low-resource; BiLSTM; data augmentation

Share and Cite

MDPI and ACS Style

Pei, Y.; Chen, S.; Ke, Z.; Silamu, W.; Guo, Q. AB-LaBSE: Uyghur Sentiment Analysis via the Pre-Training Model with BiLSTM. Appl. Sci. 2022, 12, 1182. https://doi.org/10.3390/app12031182

AMA Style

Pei Y, Chen S, Ke Z, Silamu W, Guo Q. AB-LaBSE: Uyghur Sentiment Analysis via the Pre-Training Model with BiLSTM. Applied Sciences. 2022; 12(3):1182. https://doi.org/10.3390/app12031182

Chicago/Turabian Style

Pei, Yijie, Siqi Chen, Zunwang Ke, Wushour Silamu, and Qinglang Guo. 2022. "AB-LaBSE: Uyghur Sentiment Analysis via the Pre-Training Model with BiLSTM" Applied Sciences 12, no. 3: 1182. https://doi.org/10.3390/app12031182

APA Style

Pei, Y., Chen, S., Ke, Z., Silamu, W., & Guo, Q. (2022). AB-LaBSE: Uyghur Sentiment Analysis via the Pre-Training Model with BiLSTM. Applied Sciences, 12(3), 1182. https://doi.org/10.3390/app12031182

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
