Search Results (37)

Search Parameters:
Keywords = linguistic channels

27 pages, 3330 KB  
Article
Revealing Short-Term Memory Communication Channels Embedded in Alphabetical Texts: Theory and Experiments
by Emilio Matricciani
Information 2025, 16(10), 847; https://doi.org/10.3390/info16100847 - 30 Sep 2025
Viewed by 439
Abstract
The aim of the present paper is to further develop a theory on the flow of linguistic variables making a sentence, namely, the transformation of (a) characters into words; (b) words into word intervals; and (c) word intervals into sentences. The relationship between two linguistic variables is studied as a communication channel whose performance is determined by the slope of their regression line and by their correlation coefficient. The mathematical theory is applicable to any field/specialty in which a linear relationship holds between two variables. The signal-to-noise ratio Γ is a figure of merit of a channel being “deterministic”, i.e., a channel in which the scattering of the data around the regression line is negligible. The larger Γ is, the more “deterministic” the channel is. In conclusion, humans have invented codes in which the sequences of symbols that make words cannot vary very much when indicating single physical or mental objects of their experience (larger Γ). On the contrary, large variability (smaller Γ) is achieved by introducing interpunctions to make word intervals, and word intervals make sentences that communicate concepts. This theory can inspire new lines of research in cognitive science.
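As a rough illustration of the quantities this abstract names, the sketch below fits a regression line to paired samples and reports the slope, the correlation coefficient, and a signal-to-noise ratio. The ratio used here, r²/(1 − r²) (explained over residual variance), is an assumption chosen for illustration and is not necessarily the paper's definition of Γ; the function name `channel_stats` and the sample numbers are hypothetical.

```python
import math


def channel_stats(x, y):
    """Least-squares slope, correlation coefficient, and an assumed SNR.

    The SNR is taken as explained variance over residual variance,
    r^2 / (1 - r^2). This is an illustrative assumption, not the
    definition used in the paper.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    r = sxy / math.sqrt(sxx * syy)
    residual = sxx * syy - sxy ** 2
    # No scatter around the line at all: treat the channel as noiseless.
    snr = math.inf if residual <= 0 else sxy ** 2 / residual
    return slope, r, snr


if __name__ == "__main__":
    # Hypothetical paired samples, e.g. words vs. characters per sentence.
    words = [10, 15, 20, 25, 30]
    chars = [46, 70, 89, 116, 134]
    slope, r, snr = channel_stats(words, chars)
    print(f"slope={slope:.3f} r={r:.4f} snr={10 * math.log10(snr):.1f} dB")
```

Under this assumed definition, data that hug the regression line give a large ratio and widely scattered data give a small one, which matches the qualitative reading of Γ in the abstract.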

26 pages, 12107 KB  
Article
Empowering Older Migrants: Co-Designing Climate Communication with Chinese Seniors in the UK
by Qing Ni, Hua Dong and Antonios Kaniadakis
J. Ageing Longev. 2025, 5(4), 37; https://doi.org/10.3390/jal5040037 - 24 Sep 2025
Viewed by 395
Abstract
This study explores how older Chinese migrants in London engage with climate change discourse using participatory co-design workshops. Although already practising sustainability behaviours such as recycling, this group faces significant barriers, particularly language difficulties and cultural differences, that limit their active participation in broader climate initiatives. The research addresses three key aspects: (1) identifying opportunities for sustainable practices within migrants’ daily routines; (2) understanding their influential roles within families and communities; and (3) examining their trusted sources and preferred channels for climate communication. Results highlight that family and community networks, combined with digital platforms (e.g., WeChat) and visually engaging materials, play essential roles in disseminating climate information. Participants expressed strong motivations rooted in intergenerational responsibility and economic benefits. The findings emphasise the necessity of inclusive and peer-led communication strategies that are attuned to older migrants’ linguistic preferences, media habits, and cultural values, underscoring their significant but often overlooked potential to meaningfully contribute to climate action.
(This article belongs to the Special Issue Aging in Place: Supporting Older People's Well-Being and Independence)

16 pages, 406 KB  
Article
Anglicizing Humor in a Spanish Satirical TV Show—Pragmatic Functions and Discourse Strategies
by María-Isabel González-Cruz
Languages 2025, 10(9), 230; https://doi.org/10.3390/languages10090230 - 10 Sep 2025
Viewed by 1086
Abstract
Humor is a pragmatic and interdisciplinary phenomenon whose sociocultural relevance has been increasingly recognized in academia. Surprisingly, although the anthropo-philosophical theory of homo risu emerged in the 7th century, linguists became interested in the study of the linguistic mechanisms of humor only a few years ago. One of those mechanisms is the use of Anglicisms, because of their pragmatic potential to provide some added value, a halo of prestige and modernity, which creates playful effects of complicity. This paper examines the way Anglicisms crucially contribute to the humorous discourse of the satirical news show El Intermedio, the longest-running program on a Spanish private TV channel. Monitoring of 300 episodes broadcast between April 2022 and December 2024 shows how, in addition to puns and irony, scriptwriters tend to resort to a number of strategies involving the creative use of Anglicisms, which perform different pragmatic functions while showing sociolinguistic awareness. They also offer an up-to-date sample of the great vitality of Anglicisms in contemporary Spain.
(This article belongs to the Special Issue Exploring Pragmatics in Contemporary Cross-Cultural Contexts)

16 pages, 74973 KB  
Article
TVI-MFAN: A Text–Visual Interaction Multilevel Feature Alignment Network for Visual Grounding in Remote Sensing
by Hao Chi, Weiwei Qin, Xingyu Chen, Wenxin Guo and Baiwei An
Remote Sens. 2025, 17(17), 2993; https://doi.org/10.3390/rs17172993 - 28 Aug 2025
Viewed by 713
Abstract
Visual grounding for remote sensing (RSVG) focuses on localizing specific objects in remote sensing (RS) imagery based on linguistic expressions. Existing methods typically employ pre-trained models to locate the referenced objects. However, due to the insufficient capability of cross-modal interaction and alignment, the extracted visual features may suffer from semantic drift, limiting the performance of RSVG. To address this, the article introduces a novel RSVG framework named the text–visual interaction multilevel feature alignment network (TVI-MFAN), which leverages a text–visual interaction attention (TVIA) module to dynamically generate adaptive weights and biases at both spatial and channel dimensions, enabling the visual features to focus on relevant linguistic expressions. Additionally, a multilevel feature alignment network (MFAN) aggregates contextual information by using cross-modal alignment to enhance features and suppress irrelevant regions. Experiments demonstrate that the proposed method achieves 75.65% and 80.24% accuracy (2.42% and 3.1% absolute improvement) on the OPT-RSVG and DIOR-RSVG datasets, respectively, validating its effectiveness.

20 pages, 2026 KB  
Article
Synonym Substitution Steganalysis Based on Heterogeneous Feature Extraction and Hard Sample Mining Re-Perception
by Jingang Wang, Hui Du and Peng Liu
Big Data Cogn. Comput. 2025, 9(8), 192; https://doi.org/10.3390/bdcc9080192 - 22 Jul 2025
Viewed by 830
Abstract
Linguistic steganography can be utilized to establish covert communication channels on social media platforms, thus facilitating the dissemination of illegal messages, seriously compromising cyberspace security. Synonym substitution-based linguistic steganography methods have garnered considerable attention due to their simplicity and strong imperceptibility. Existing linguistic steganalysis methods have not achieved excellent detection performance for the aforementioned type of linguistic steganography. In this paper, based on the idea of focusing on accumulated differences, we propose a two-stage synonym substitution-based linguistic steganalysis method that does not require a synonym database and can effectively detect texts with very low embedding rates. Experimental results demonstrate that this method achieves an average detection accuracy 2.4% higher than the comparative method.

19 pages, 914 KB  
Article
RU-OLD: A Comprehensive Analysis of Offensive Language Detection in Roman Urdu Using Hybrid Machine Learning, Deep Learning, and Transformer Models
by Muhammad Zain, Nisar Hussain, Amna Qasim, Gull Mehak, Fiaz Ahmad, Grigori Sidorov and Alexander Gelbukh
Algorithms 2025, 18(7), 396; https://doi.org/10.3390/a18070396 - 28 Jun 2025
Cited by 2 | Viewed by 1003
Abstract
The detection of abusive language in Roman Urdu is important for secure digital interaction. This work investigates machine learning (ML), deep learning (DL), and transformer-based methods for detecting offensive language in Roman Urdu comments collected from YouTube news channels. Features are extracted with TF-IDF and Count Vectorizer over unigrams, bigrams, and trigrams. Among the ML models, namely Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), and Naïve Bayes (NB), the best performance was achieved by SVM. Of the two DL models evaluated, Bi-LSTM and CNN, the CNN performed better. Moreover, transformer variants such as LLaMA 2 and ModernBERT (MBERT) were instantiated and fine-tuned with LoRA (Low-Rank Adaptation) for better efficiency. LoRA is designed for large language models (LLMs) and makes fine-tuning efficient at very low computational cost. According to the experimental results, LLaMA 2 with LoRA attained the highest F1-score of 96.58%, greatly exceeding the performance of the other approaches. LoRA-optimized transformers capture fine linguistic nuances well, which suits them to Roman Urdu offensive language detection. The study compares the performance of conventional and contemporary NLP methods, highlighting the relevance of effective fine-tuning methods. Our findings pave the way for scalable and accurate automated moderation systems for online platforms supporting multiple languages.
(This article belongs to the Topic Applications of NLP, AI, and ML in Software Engineering)

15 pages, 1258 KB  
Article
Are Children Sensitive to Ironic Prosody? A Novel Task to Settle the Issue
by Francesca Panzeri and Beatrice Giustolisi
Languages 2025, 10(7), 152; https://doi.org/10.3390/languages10070152 - 25 Jun 2025
Viewed by 973
Abstract
Ironic remarks are often pronounced with a distinctive intonation. It is not clear whether children rely on acoustic cues to attribute an ironic intent. This question has been only indirectly tackled, with studies that manipulated the intonation with which the final remark is pronounced within an irony comprehension task. We propose a new task that is meant to assess whether children rely on prosody to infer speakers’ sincere or ironic communicative intentions, without requiring meta-linguistic judgments (since pragmatic awareness is challenging for young children). Children listen to evaluative remarks (e.g., “That house is really beautiful”), pronounced with sincere or ironic intonation, and they are asked to identify what the speaker is referring to by selecting one of two pictures depicting an image corresponding to a literal interpretation (a luxury house) and one to its reverse interpretation (a hovel). We tested eighty children aged 3 to 11 years and found a clear developmental trend, with children consistently responding above the chance level from age seven, and there was no correlation with the recognition of emotions transmitted through the vocal channel.
(This article belongs to the Special Issue Advances in the Acquisition of Prosody)

19 pages, 1823 KB  
Review
A Bibliometric Analysis and Visualization of In-Vehicle Communication Protocols
by Iftikhar Hussain, Manuel J. C. S. Reis, Carlos Serôdio and Frederico Branco
Future Internet 2025, 17(6), 268; https://doi.org/10.3390/fi17060268 - 19 Jun 2025
Viewed by 1169
Abstract
This research examined the domain of intelligent transportation systems (ITS) by analyzing the impact of scholarly work and thematic prevalence, with a focus on vehicles, their technologies, cybersecurity, and related technologies. This was performed by examining the scientific literature indexed in the Scopus database. This study analysed 2919 documents published between 2018 and 2025. The findings indicated that the most significant journal was IEEE Transactions on Vehicular Technology, which stands out for its contribution to the growth of vehicular communication and computing, including edge computing and AI optimization of vehicular systems. In addition, important PST research conferences highlighted the growing academic interest in cybersecurity for vehicle networks. Sensor networks, pose forensics, and privacy-preserving communication frameworks were among the significant contributing fields, underlining the interdisciplinary nature of this research. Through citation analysis, the bibliometric study illustrated the multiple channels integrating knowledge creation and innovation in ITS. The outcome suggested an increasingly sophisticated research area that balances technical progress against growing concern about security and privacy measures. Further studies must investigate edge computing integrated with AI, advanced privacy-preserving linguistic protocols, and new vehicular network intrusion detection systems.

28 pages, 315 KB  
Article
Mapping Extent of Spillover Channels in Monetary Space: Study of Multidimensional Spatial Effects of US Dollar Liquidity
by Changrong Lu, Lian Liu, Fandi Yu, Jiaxiang Li and Guanghong Zheng
Int. J. Financial Stud. 2025, 13(2), 72; https://doi.org/10.3390/ijfs13020072 - 1 May 2025
Cited by 1 | Viewed by 912
Abstract
This study aims to analyze the spatial effects triggered by dollar liquidity by constructing a multidimensional spatial matrix that modifies the traditional monetary spatial framework. We utilized a three-level spatial econometric model (Spatial Lag, Durbin, and Generalized Nested Space) to measure Gross Domestic Product (GDP), Consumer Price Index (CPI), and Asset Price Bubbles (BBL) through five spillover channels (geography, linguistics, politics, war, and economy). Our aim is to establish a systematic relationship between the conduction mechanism, means, economic indicators, and dollar externalities to examine liquidity spillover effects at varying distances in the global monetary space. We find that the spatial effects induced by the global circulation of the US dollar behave significantly differently in a single matrix space compared to in a multidimensional space. While the model verifies the existence of a positive correlation between the complexity of a single space and the spillover effect from a conduction mechanism perspective, the measure of the multidimensional matrix shows that the significance of the spillover effect weakens with an increase in abstraction level from a conduction means perspective. It suggests that spatial matrices of different dimensions reflect different economic realities. The former shows hierarchical multivariate details in independent matrices, while the variation in the level of abstraction of matrices of different dimensions in the latter enhances their interactivity and complexity.
29 pages, 3481 KB  
Article
Translation Can Distort the Linguistic Parameters of Source Texts Written in Inflected Language: Multidimensional Mathematical Analysis of “The Betrothed”, a Translation in English of “I Promessi Sposi” by A. Manzoni
by Emilio Matricciani
AppliedMath 2025, 5(1), 24; https://doi.org/10.3390/appliedmath5010024 - 4 Mar 2025
Cited by 1 | Viewed by 2407
Abstract
We compare, mathematically, the text of a famous Italian novel, I promessi sposi, written by Alessandro Manzoni (source text), to its most recent English translation, The Betrothed by Michael F. Moore (target text). The mathematical theory applied does not measure the efficacy and beauty of texts, only their underlying mathematical structure and similarity. The translation theory adopted by the translator is the “domestication” of the source text because English is not as economical in its use of subject pronouns as Italian. A domestication index measures the degree of domestication. The modification of the original mathematical structure produces several consequences on the short-term memory buffers required for the reader and on the theoretical number of patterns used to construct sentences. The geometrical representation of texts and the related probability of error indicate that the two texts are practically uncorrelated. A fine-tuning analysis shows that linguistic channels are very noisy, with very poor signal-to-noise ratios, except the channels related to characters and words. Readability indices are also diverse. In conclusion, a blind comparison of the linguistic parameters of the two texts would be unlikely to indicate that they refer to the same novel.

32 pages, 1379 KB  
Article
Multi-Criteria Decision Analysis for Sustainable Medicinal Supply Chain Problems with Adaptability and Challenges Issues
by Alaa Fouad Momena, Kamal Hossain Gazi and Sankar Prasad Mondal
Logistics 2025, 9(1), 31; https://doi.org/10.3390/logistics9010031 - 14 Feb 2025
Cited by 3 | Viewed by 2004
Abstract
Background: The supply chain refers to the full process of creating and providing a good or service, starting with the raw materials and ending with the final customer. It requires cooperation and coordination between many parties, including the suppliers, manufacturers, distributors, retailers, and customers. Methods: In the medicinal supply chain (MSC), the critical nature of these processes becomes more complicated. It requires strict regulation, quality control, and traceability to ensure patient safety and compliance with regulatory standards. This study is conducted to suggest a smooth channel to deal with the challenges and adaptability of the MSC. Different MSC challenges are considered as criteria which deal with various adaptation plans. Multi-criteria decision-making (MCDM) methodologies are taken as optimization tools, and probabilistic linguistic term sets (PLTSs) are used to express uncertainty. Results: The subscript degree function (SDF) and deviation degree function (DDF) are introduced to evaluate the crisp value of the PLTSs. An MSC model is constructed to optimize the sustainable medicinal supply chain and overcome various barriers to MSC problems. Conclusions: Additionally, sensitivity analysis and comparative analysis were conducted to check the robustness and flexibility of the system. Finally, the conclusion section determines the optimal weighted criteria for the MSC problem and identifies the best possible solutions for MSC using PLTS-based MCDM methodologies.

29 pages, 3822 KB  
Article
A Fuzzy Logic Technique for the Environmental Impact Assessment of Marine Renewable Energy Power Plants
by Pamela Flores and Edgar Mendoza
Energies 2025, 18(2), 272; https://doi.org/10.3390/en18020272 - 9 Jan 2025
Cited by 2 | Viewed by 1676
Abstract
The application of fuzzy logic to environmental impact assessment (EIA) provides a robust method to address uncertainties and subjectivities inherent in evaluating complex environmental systems. This is particularly relevant in ocean renewable energy projects, where predicting environmental impacts is challenging due to the dynamic nature of marine environments. We conducted a comprehensive literature review to identify the types of impacts currently being investigated, assessed, and monitored in existing marine energy conversion projects. Based on these foundations, we developed both traditional and fuzzy methodologies for EIA. The fuzzy logic approach allows for the incorporation of uncertainties into the assessment process, converting qualitative assessments into quantifiable data and linguistic levels and enhancing decision-making accuracy. We tested this fuzzy methodology across four types of ocean energy devices: floating, submerged, fixed to the ocean floor, and onshore. Finally, we applied the methodology to the EIA of a marine energy project in the Cozumel Channel, Quintana Roo, Mexico. The results demonstrate that fuzzy logic provides a more flexible and reliable evaluation of environmental impacts, contributing to more effective environmental management and sustainable development in marine renewable energy contexts.

20 pages, 1350 KB  
Article
Textual Attributes of Corporate Sustainability Reports and ESG Ratings
by Jie Huang, Derek D. Wang and Yiying Wang
Sustainability 2024, 16(21), 9270; https://doi.org/10.3390/su16219270 - 25 Oct 2024
Cited by 6 | Viewed by 8159
Abstract
While the textual attributes of corporate financial documents, such as annual reports, have been extensively analyzed in the academic literature, those of corporate sustainability reports, which serve as a critical channel for nonfinancial disclosure, are relatively under-explored. Given the increasing importance of Environmental, Social, and Governance (ESG) factors in corporate strategy and stakeholder evaluation, understanding the role of textual attributes in sustainability reporting is crucial. This study examines 10,021 hand-collected sustainability reports from Chinese firms between 2009 and 2021, focusing on six key textual attributes: length, readability, tone, boilerplate language, redundancy, and completeness. Using computational linguistics, we analyze how these attributes evolve over time and their impact on ESG ratings provided by both international (MSCI, FTSE) and domestic (SNSI) agencies. Our findings reveal that the length and completeness of sustainability reports significantly influence ESG scores across agencies, demonstrating a shared appreciation for detailed and transparent disclosures. However, international and domestic rating agencies exhibit differing responses to attributes like tone, boilerplate language, and redundancy. These differences highlight variations in evaluation standards, methodologies, and value orientations between global and local stakeholders. The results emphasize the need for firms to tailor their sustainability disclosures to meet diverse stakeholder expectations. This study contributes to the growing body of literature on nonfinancial reporting by providing empirical evidence on how specific textual characteristics of sustainability reports can shape ESG evaluations, offering insights for both corporate communicators and policymakers.
(This article belongs to the Special Issue Sustainable Governance: ESG Practices in the Modern Corporation)

30 pages, 720 KB  
Review
Inclusive Crisis Communication in a Pandemic Context: A Rapid Review
by Karin Hannes, Pieter Thyssen, Theresa Bengough, Shoba Dawson, Kristel Paque, Sarah Talboom, Krizia Tuand, Thomas Vandendriessche, Wessel van de Veerdonk, Daniëlle Wopereis and Anne-Mieke Vandamme
Int. J. Environ. Res. Public Health 2024, 21(9), 1216; https://doi.org/10.3390/ijerph21091216 - 16 Sep 2024
Cited by 1 | Viewed by 3010
Abstract
Background: Crisis communication might not reach non-native speakers or persons with low literacy levels, a low socio-economic status, and/or auditory or visual impairments as easily as it would reach other citizens. The aim of this rapid review was to synthesize the evidence on strategies used to improve inclusive pandemic-related crisis communication in terms of form, channel, and outreach. Methods: After a comprehensive search and a rigorous screening and quality assessment exercise, twelve comparative studies were selected for inclusion in this review. Data were analyzed and represented by means of a structured reporting of available effects using narrative tables. Results: The findings indicate that a higher message frequency (on any channel) may lead to a lower recall rate, audio–visual productions and tailored messages prove to be valuable under certain conditions, and primary healthcare practitioners appear to be the most trusted source of information for most groups of citizens. Trust levels were higher for citizens who were notified in advance of potential exceptions to the rule in the effect of preventive and curative measures promoted. Conclusions: This review contributes to combatting information inequality by providing evidence on how to remove the sensorial, linguistic, cultural, and textual barriers experienced by minorities and other underserved target audiences in COVID-19-related governmental crisis communication in response to the societal, health-related costs of ineffective communication outreach.
(This article belongs to the Special Issue Health Literacy and Communicable Diseases)

11 pages, 1212 KB  
Article
Building a Speech Dataset and Recognition Model for the Minority Tu Language
by Shasha Kong, Chunmei Li, Chengwu Fang and Peng Yang
Appl. Sci. 2024, 14(15), 6795; https://doi.org/10.3390/app14156795 - 4 Aug 2024
Cited by 2 | Viewed by 1583
Abstract
Speech recognition technology has many applications in our daily life. However, for many low-resource languages without written forms, acquiring sufficient training data remains a significant challenge for building accurate ASR models. The Tu language, spoken by an ethnic minority group in Qinghai Province in China, is one such example. Due to the lack of written records and the great diversity in regional pronunciations, there has been little previous research on Tu-language speech recognition. This work seeks to address this research gap by creating the first speech dataset for the Tu language spoken in Huzhu County, Qinghai. We first formulated the relevant pronunciation rules for the Tu language based on linguistic analysis. Then, we constructed a new speech corpus, named HZ-TuDs, through targeted data collection and annotation. Based on the HZ-TuDs dataset, we designed several baseline sequence-to-sequence deep neural models for end-to-end Tu-language speech recognition. Additionally, we proposed a novel SA-conformer model, which combines convolutional and channel attention modules to better extract speech features. Experiments showed that our proposed SA-conformer model can significantly reduce the character error rate from 23% to 12%, effectively improving the accuracy of Tu language recognition compared to previous approaches. This demonstrates the effectiveness of our dataset construction and model design efforts in advancing speech recognition technology for this low-resource minority language.
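For readers unfamiliar with the metric quoted in this abstract, character error rate is conventionally the character-level edit distance between the reference transcript and the recognizer output, divided by the reference length. The sketch below implements that conventional definition; the abstract does not say how the authors computed their scores, and the function names are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        cur = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_char != hyp_char),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalised by the number of reference characters."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One substituted character out of four reference characters.
    print(character_error_rate("abcd", "abxd"))
```

Because insertions count as errors, the rate can exceed 1.0 when the hypothesis is much longer than the reference.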