Search Results (1)

Search Parameters:
Keywords = pseudo-parallel sentence extraction

11 pages, 1428 KiB  
Article
Chinese–Vietnamese Pseudo-Parallel Sentences Extraction Based on Image Information Fusion
by Yonghua Wen, Junjun Guo, Zhiqiang Yu and Zhengtao Yu
Information 2023, 14(5), 298; https://doi.org/10.3390/info14050298 - 21 May 2023
Cited by 1 | Viewed by 2107
Abstract
Parallel sentences play a crucial role in various NLP tasks, particularly cross-lingual tasks such as machine translation. However, due to the time-consuming and laborious nature of manual construction, many low-resource languages still lack large-scale parallel data. The objective of pseudo-parallel sentence extraction is to automatically identify sentence pairs in different languages that convey similar meanings. Earlier methods relied heavily on parallel data, which is unsuitable for low-resource scenarios. The current mainstream research direction is to use transfer learning or unsupervised learning based on cross-lingual word embeddings and multilingual pre-trained models; however, these methods are ineffective for languages with substantial differences. To address this issue, we propose a sentence extraction method that leverages image information fusion to extract Chinese–Vietnamese pseudo-parallel sentences from collections of bilingual texts. Our method first employs an adaptive image and text feature fusion strategy to efficiently extract bilingual parallel sentence pairs, and then a multimodal fusion method is presented to balance the information between the image and text modalities. Experiments on multiple benchmarks show that our method achieves promising results compared to a competitive baseline by infusing additional external image information.
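The abstract does not spell out the fusion mechanism, so the following is a minimal, hypothetical PyTorch sketch of one plausible way to fuse text and image features and score candidate sentence pairs. All module names, embedding dimensions, and the thresholding idea are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedImageTextFusion(nn.Module):
    """Hypothetical gated fusion of sentence and image embeddings.

    A learned per-dimension gate decides how much image information
    to admit alongside the text representation.
    """
    def __init__(self, text_dim=768, image_dim=2048, hidden_dim=512):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.image_proj = nn.Linear(image_dim, hidden_dim)
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, text_emb, image_emb):
        t = torch.tanh(self.text_proj(text_emb))
        v = torch.tanh(self.image_proj(image_emb))
        # Gate near 1 keeps mostly text; near 0 keeps mostly image.
        g = torch.sigmoid(self.gate(torch.cat([t, v], dim=-1)))
        return g * t + (1 - g) * v

def pair_score(fusion, zh_text, zh_image, vi_text, vi_image):
    """Cosine similarity between fused Chinese and Vietnamese
    representations; pairs scoring above a tuned threshold would
    be kept as pseudo-parallel."""
    zh = fusion(zh_text, zh_image)
    vi = fusion(vi_text, vi_image)
    return F.cosine_similarity(zh, vi, dim=-1)

# Toy usage: random tensors stand in for real encoder outputs.
fusion = GatedImageTextFusion()
zh_t, vi_t = torch.randn(4, 768), torch.randn(4, 768)
zh_v, vi_v = torch.randn(4, 2048), torch.randn(4, 2048)
print(pair_score(fusion, zh_t, zh_v, vi_t, vi_v))  # shape: (4,)
```

One appeal of a gated design is graceful degradation: when an image is missing or uninformative, the gate can fall back to text-only matching, which is in the spirit of the abstract's stated goal of balancing the two modalities.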