Search Results (201)

Search Parameters:
Keywords = Russian language

32 pages, 604 KB  
Article
Translation and Power in Georgia: Postcolonial Trajectories from Socialist Realism to Post-Soviet Market Pressures
by Gül Mükerrem Öztürk
Humanities 2025, 14(9), 174; https://doi.org/10.3390/h14090174 (registering DOI) - 25 Aug 2025
Abstract
This study examines the transformation of literary translation practices in Georgia from the Soviet era to the post-Soviet and neoliberal periods, using postcolonial translation theory as the main analytical lens. Translation is treated not merely as a linguistic transfer but as a process shaped by ideological control, cultural representation, and global power hierarchies. In the Soviet era, censorship policies rooted in socialist realism imposed direct ideological interventions; children’s literature such as Maya the Bee and Bambi exemplified how religious or individualist themes were replaced with collectivist narratives. In the post-Soviet period, overt censorship has largely disappeared; however, structural factors—including the absence of a coherent national translation policy, economic precarity, and dependence on Western funding—have become decisive in shaping translation choices. The shift from Russian to English as the dominant source language has introduced new symbolic hierarchies, privileging Anglophone literature while marginalizing regional and non-Western voices. Drawing on the Georgian Book Market Research 2013–2015 alongside archival materials, paratextual analysis, and contemporary case studies, including the Georgian translation of André Aciman’s Call Me By Your Name, the study shows how translators negotiate between market expectations, cultural taboos, and ethical responsibility. It argues that translation in Georgia remains a contested site of cultural negotiation and epistemic justice. Full article
14 pages, 1112 KB  
Article
A Kalmyk Pilgrim in the Biography of the Dalai Lama: Baaza Bagshi’s Journey to Tibet as Seen from Both Sides
by Bembya Mitruev
Religions 2025, 16(8), 1085; https://doi.org/10.3390/rel16081085 - 21 Aug 2025
Viewed by 230
Abstract
Alongside historical narratives, there exists, in Old Kalmyk literature, a lesser-known corpus of travel writing that documents pilgrimages to major religious and political centers such as China, Tibet, and Mongolia. One notable and extant example of this genre is the travel account of Baaza Menkedjuev, a Gelung from the Maloderbetovskiy Ulus, more widely known as Baaza Bagshi. His first-person narrative was translated into Russian by A. M. Pozdneev in 1897 under the name “Skazanie o khozhdenii v Tibetskuiu stranu malo-dörbötskago Baaza Bagshi” [Narrative of the travel to Tibet by the Maloderbet Baaza Bagshi] and offers valuable ethnographic insights into a Kalmyk pilgrim’s journey to Tibet in the late 19th and early 20th centuries. Until recently, scholarship on Baaza Bagshi’s Tibetan sojourn has been confined to his own account, with no corroborating evidence found in Tibetan-language sources. This study addresses that lacuna by examining references to Baaza Bagshi in the Tibetan-language biography (Tib. rnam thar) of the 13th Dalai Lama, Thubten Gyatso (Tib. thub bstan rgya mtsho, 1876–1933). The significance of these references lies not only in the information provided about the number of audiences with the Dalai Lama Baaza Bagshi received, the dates of his visits, and the content of their meetings, but also in the fact that they demonstrate how the Kalmyks—despite living in the European part of Russia, the furthest from the Mongolian Buddhist world—did not lose their religious ties with Tibet. The corroboration of Baaza Bagshi’s visit in both Kalmyk and Tibetan sources allows for a more integrated understanding of Kalmyk–Tibetan relations and contributes to the study of interregional Buddhist networks. Methods of historical contextualization, historiographical critique, and comparative source analysis were used for this research. Full article
(This article belongs to the Special Issue Tibet-Mongol Buddhism Studies)
20 pages, 2833 KB  
Article
A Multi-Level Annotation Model for Fake News Detection: Implementing Kazakh-Russian Corpus via Label Studio
by Madina Sambetbayeva, Anargul Nekessova, Aigerim Yerimbetova, Abdygalym Bayangali, Mira Kaldarova, Duman Telman and Nurzhigit Smailov
Big Data Cogn. Comput. 2025, 9(8), 215; https://doi.org/10.3390/bdcc9080215 - 20 Aug 2025
Viewed by 217
Abstract
This paper presents a multi-level annotation model for detecting fake news in Kazakh and Russian languages, aiming to enhance understanding of disinformation strategies in multilingual digital media environments. Unlike traditional binary models, our approach captures the complexity of disinformation by accounting for both linguistic and cultural factors. To support this, a corpus of over 5000 news texts was manually annotated using the Label Studio platform. The annotation scheme consists of seven interrelated categories: CLAIM, SOURCE, EVIDENCE, DISINFORMATION_TECHNIQUE, AUTHOR_INTENT, TARGET_AUDIENCE, and TIMESTAMP. Inter-annotator agreement, evaluated using Cohen’s Kappa, ranged from 0.72 to 0.81, indicating substantial consistency. The annotated data reveals recurring patterns of disinformation, such as emotional manipulation, targeting of vulnerable individuals, and the strategic concealment of intent. Semantic relations between entities, such as CLAIM → EVIDENCE and CLAIM → AUTHOR_INTENT were formalized to represent disinformation narratives as knowledge graphs. This study contributes the first linguistically and culturally adapted annotation model for Kazakh and Russian languages, providing a robust and empirical resource for building interpretable and context-aware fake news detection systems. The resulting annotated corpus and its semantic structure offer valuable empirical material for further research in natural language processing, computational linguistics, and media studies in low-resource language environments. Full article
18 pages, 319 KB  
Article
Information Extraction from Multi-Domain Scientific Documents: Methods and Insights
by Tatiana Batura, Aigerim Yerimbetova, Nurzhan Mukazhanov, Nikita Shvarts, Bakzhan Sakenov and Mussa Turdalyuly
Appl. Sci. 2025, 15(16), 9086; https://doi.org/10.3390/app15169086 - 18 Aug 2025
Viewed by 248
Abstract
The rapid growth of scientific literature necessitates effective information extraction. However, existing methods face significant challenges, particularly when applied to multi-domain documents and low-resource languages. For Kazakh and Russian, there is a notable lack of annotated corpora and dedicated tools for scientific information extraction. To address this gap, we introduce SciMDIX (Scientific Multi-Domain Information Extraction), a novel multi-domain dataset of scientific documents in Russian and Kazakh, annotated with entities and relations. Our study includes a comprehensive evaluation of entity recognition performance, comparing state-of-the-art models, such as BERT, LLaMA, GLiNER, and spaCy, across four diverse domains (IT, Linguistics, Medicine, and Psychology) in both languages. The findings highlight the promise of spaCy and GLiNER for practical deployment in under-resourced language settings. Furthermore, we propose a new zero-shot relation extraction model that leverages a multimodal representation by integrating sentence context, entity mentions, and textual definitions of relation classes. Our model can predict semantic relations between entities in new documents, even for a language not encountered during training. This capability is especially valuable for low-resource language scenarios. Full article
37 pages, 5086 KB  
Article
Global Embeddings, Local Signals: Zero-Shot Sentiment Analysis of Transport Complaints
by Aliya Nugumanova, Daniyar Rakhimzhanov and Aiganym Mansurova
Informatics 2025, 12(3), 82; https://doi.org/10.3390/informatics12030082 - 14 Aug 2025
Viewed by 500
Abstract
Public transport agencies must triage thousands of multilingual complaints every day, yet the cost of training and serving fine-grained sentiment analysis models limits real-time deployment. The proposed “one encoder, any facet” framework therefore offers a reproducible, resource-efficient alternative to heavy fine-tuning for domain-specific sentiment analysis or opinion mining tasks on digital service data. To the best of our knowledge, we are the first to test this paradigm on operational multilingual complaints, where public transport agencies must prioritize thousands of Russian- and Kazakh-language messages each day. A human-labelled corpus of 2400 complaints is embedded with five open-source universal models. Obtained embeddings are matched to semantic “anchor” queries that describe three distinct facets: service aspect (eight classes), implicit frustration, and explicit customer request. In the strict zero-shot setting, the best encoder reaches 77% accuracy for aspect detection, 74% for frustration, and 80% for request; taken together, these signals reproduce human four-level priority in 60% of cases. Attaching a single-layer logistic probe on top of the frozen embeddings boosts performance to 89% for aspect, 83–87% for the binary facets, and 72% for end-to-end triage. Compared with recent fine-tuned sentiment analysis systems, our pipeline cuts memory demands by two orders of magnitude and eliminates task-specific training yet narrows the accuracy gap to under five percentage points. These findings indicate that a single frozen encoder, guided by handcrafted anchors and an ultra-light head, can deliver near-human triage quality across multiple pragmatic dimensions, opening the door to low-cost, language-agnostic monitoring of digital-service feedback. Full article
(This article belongs to the Special Issue Practical Applications of Sentiment Analysis)
18 pages, 640 KB  
Article
Fine-Tuning Methods and Dataset Structures for Multilingual Neural Machine Translation: A Kazakh–English–Russian Case Study in the IT Domain
by Zhanibek Kozhirbayev and Zhandos Yessenbayev
Electronics 2025, 14(15), 3126; https://doi.org/10.3390/electronics14153126 - 6 Aug 2025
Viewed by 396
Abstract
This study explores fine-tuning methods and dataset structures for multilingual neural machine translation using the No Language Left Behind model, with a case study on Kazakh, English, and Russian. We compare single-stage and two-stage fine-tuning approaches, as well as triplet versus non-triplet dataset configurations, to improve translation quality. A high-quality, 50,000-triplet dataset in the information technology domain, manually translated and expert-validated, serves as the in-domain benchmark, complemented by out-of-domain corpora such as KazParC. Evaluations using BLEU, chrF, METEOR, and TER metrics reveal that single-stage fine-tuning excels for low-resource pairs (e.g., 0.48 BLEU, 0.77 chrF for Kazakh → Russian), while two-stage fine-tuning benefits high-resource pairs (Russian → English). Triplet datasets improve cross-linguistic consistency compared with non-triplet structures. Our reproducible framework offers practical guidance for adapting neural machine translation to technical domains and low-resource languages. Full article
18 pages, 263 KB  
Article
Assessing Quality of Life in Hemodialysis Patients in Kazakhstan: A Cross-Sectional Study
by Aruzhan Asanova, Aidos Bolatov, Deniza Suleimenova, Yelnur Khazhgaliyeva, Saule Shaisultanova, Sholpan Altynova and Yuriy Pya
J. Clin. Med. 2025, 14(14), 5021; https://doi.org/10.3390/jcm14145021 - 16 Jul 2025
Viewed by 349
Abstract
Background: The Kidney Disease and Quality of Life Short Form (KDQOL-SF™ 1.3) is widely used to assess health-related quality of life (HRQoL) in patients with end-stage renal disease. However, no prior validation had been conducted in Kazakhstan, where both Kazakh and Russian are commonly spoken. This study aimed to validate the Kazakh and Russian versions of the KDQOL-SF™ 1.3 and to identify predictors of HRQoL among hemodialysis patients in Kazakhstan. Methods: A cross-sectional survey was conducted among 217 adult hemodialysis patients from February to April 2025 using a mixed-methods approach (in-person interviews and online data collection). Psychometric testing included Cronbach’s alpha, floor and ceiling effect analysis, and Pearson correlations with self-rated overall health. Multiple linear regression was used to identify predictors of the Kidney Disease Component Summary (KDCS), Physical Component Summary (PCS), and Mental Component Summary (MCS) scores. Results: Both language versions demonstrated acceptable to excellent internal consistency (Cronbach’s α = 0.692–0.939). Most subscales were significantly correlated with self-rated health, supporting construct validity. Regression analyses revealed that greater satisfaction with care, better economic well-being, and more positive dialysis experiences were significant predictors of higher KDCS and MCS scores. Lower PCS scores were associated with female gender, comorbidities, and financial burden. Importantly, financial hardship and access challenges emerged as strong negative influences on HRQoL, underscoring the role of socioeconomic and care-related factors in patient well-being. Conclusions: The KDQOL-SF™ 1.3 is a valid and reliable tool for assessing quality of life among Kazakh- and Russian-speaking hemodialysis patients in Kazakhstan. 
Integrating this instrument into routine clinical practice may facilitate more personalized, patient-centered care and help monitor outcomes beyond traditional clinical indicators. Addressing economic and access-related barriers has the potential to significantly improve both physical and mental health outcomes in this vulnerable population. Full article
(This article belongs to the Section Nephrology & Urology)
20 pages, 283 KB  
Article
Integrating International Foodways and the Dominant Language Constellation Approach in Language Studies
by Alexandra Grigorieva and Ekaterina Protassova
Educ. Sci. 2025, 15(6), 765; https://doi.org/10.3390/educsci15060765 - 17 Jun 2025
Viewed by 655
Abstract
People in multilingual societies develop complex and interconnected food-making and food-discussing networks. On the basis of an experimental course titled “Food at Home, Food on the Move: Globalization and Regionalism in Modern Food Culture” taught at the University of Helsinki, we will show how the acquisition of culinary terminology puts forward the interconnectedness of languages and the dynamics between them in several sociolinguistic contexts. The lectures were grouped geographically: Eating with the Neighbors (Finnish cuisine and Swedish, Russian, Karelian and other influences); From the Baltic to Central Europe (Latvian, Lithuanian, Polish, German, and Hungarian food cultures); Formative Cuisines of the Mediterranean (French, Italian, Greek, Middle Eastern cuisine, etc.); and Eating Outside Europe (food culture influences from the US, Mexico, China, Japan, and India). The assignments included a critical lecture diary, an essay about eating experiences, or additional reading, a conversational analysis of a culinary show, or fieldwork in an ethnic restaurant. Raising awareness of linguistic and cultural diversity, motivating course participants to discuss the role and interaction of languages in their repertoire, makes them reflect on their multilingual identities. It allows educators to explore individuals’ DLCs in different contexts while navigating diverse global and local environments based on the principles of fairness and equality in education. Full article
(This article belongs to the Special Issue Innovation and Design in Multilingual Education)
17 pages, 879 KB  
Article
The Impact of Self-Sufficiency in Basic Raw Materials of Metallurgical Companies on Required Return and Capitalization: The Case of Russia
by Sergey Galevskiy, Tatyana Ponomarenko and Pavel Tsiglianu
J. Risk Financial Manag. 2025, 18(6), 318; https://doi.org/10.3390/jrfm18060318 - 10 Jun 2025
Viewed by 1448
Abstract
This article considers the impact of self-sufficiency in basic raw materials on the level of systematic risk, required return and capitalization on the example of Russian ferrous metallurgy companies. The methods applied include classical approaches to determining beta coefficient, required return and capitalization, as well as correlation–regression analysis performed in the Python programming language (version 3.0, libraries: Numpy, Pandas, Matplotlib, Datetime, Statistics, Scipy, Bambi). The study revealed an inverse relationship between the self-sufficiency of ferrous metallurgy companies in iron ore and coking coal and their systematic risk. That was confirmed by the developed regression model. The presence of this dependence directly indicates the need to consider self-sufficiency when assessing a company’s required return and capitalization. The acquisition of the Tikhov coal mine by PJSC Magnitogorsk Iron and Steel Works (MMK) led to an increase in capitalization not only due to additional profit from the new asset, but also due to a decrease in the required return caused by the growth of the company’s self-sufficiency in coking coal. The proposed approach contributes to a more accurate assessment of the company’s capitalization and creates additional incentives for vertical integration transactions. Full article
(This article belongs to the Special Issue Corporate Finance: Financial Management of the Firm)
26 pages, 1323 KB  
Article
“Hands off Russian Schools”: How Do Online Media Portray the Linguistic Landscape of Protests Against Minority Education Reform in Latvia?
by Solvita Burr
Journal. Media 2025, 6(2), 84; https://doi.org/10.3390/journalmedia6020084 - 7 Jun 2025
Viewed by 1318
Abstract
After the collapse of the Soviet Union, Latvia regained its independence in 1991. Since then, many political and social reforms have been introduced, minority education among them. Latvia began gradually abandoning the use of minority languages as mediums of instruction and switching to teaching exclusively in Latvian as the sole state language. This caused protests by minority groups, especially by Russians, the largest minority group in Latvia. The article examines 77 online news articles by Latvian, Russian, and European media covering protests against minority education reform in Latvia between 2004 and 2024. Each news article used at least one photograph/video of placard(s) with written information from the protests. The aim of the article is to understand how different media represent the linguistic landscape of protests against minority education reform and what main discourses they create and maintain regarding the linguistic landscape of such protests in Latvia. The description of the linguistic landscapes shows three main trends: (1) only journalists (most often anonymous) describe the written information expressed at the protests, (2) emphasis is on the number of placard holders at the protests, their age and affiliation with minority support organizations and political parties, (3) author(s) quote individual slogans, more often demonstrated from one protest to another, without disclosing in which language they were originally written and what problems (within and behind the language education) they highlight or conceal. Two main narratives are reinforced through the descriptions of the linguistic landscapes included in the articles: (1) the Russian community is united and persistent in the fight against the ethnolinguistically unjust education policy pursued by the government, and (2) students, parents, and the Russian community should have the right to choose which educational program to study at school. Full article
27 pages, 386 KB  
Article
Is Negation Negative? (And a Discussion of Negative Concord in SOV Languages)
by Paloma Jeretič
Languages 2025, 10(6), 130; https://doi.org/10.3390/languages10060130 - 3 Jun 2025
Viewed by 887
Abstract
Is negation negative? For some authors, in some languages, it is not. This is the case for so-called strict negative concord languages (e.g., Russian), in which negation is taken to be non-negative, following the cross-linguistic analysis for negative concord systems proposed by Hedde Zeijlstra’s work “Sentential negation and negative concord”. However, this analysis is focused on languages with SVO word order. In this paper, I propose to reconsider the typology of negative concord by zooming out of the focus on SVO languages that current literature has relied on. I discuss the case of SOV languages where observing a strict NC pattern leads to weaker conclusions about the nature of negation than for SVO languages with strict negative concord, leaving the negativity status of negation in those languages underdetermined. I then take a look at Turkish, an SOV language with three sentential negation markers: plain sentential negation -mA, copular negation değil, and existential negation yok. Evidence from the interaction of these markers with neither..nor phrases suggests that değil and yok, in contrast with -mA, are non-negative for some speakers. In order to explain the variation, I put forward a hypothesis about the learning process, in which there is sometimes insufficient evidence in the input to determine whether değil and yok are negative, and learners choose between two conflicting heuristics that result in the negativity or non-negativity of these markers. Full article
(This article belongs to the Special Issue Theoretical Studies on Turkic Languages)
16 pages, 1091 KB  
Article
Transferring Natural Language Datasets Between Languages Using Large Language Models for Modern Decision Support and Sci-Tech Analytical Systems
by Dmitrii Popov, Egor Terentev, Danil Serenko, Ilya Sochenkov and Igor Buyanov
Big Data Cogn. Comput. 2025, 9(5), 116; https://doi.org/10.3390/bdcc9050116 - 28 Apr 2025
Viewed by 781
Abstract
The decision-making process to rule R&D relies on information related to current trends in particular research areas. In this work, we investigated how one can use large language models (LLMs) to transfer the dataset and its annotation from one language to another. This is crucial since sharing knowledge between different languages could boost certain underresourced directions in the target language, saving lots of effort in data annotation or quick prototyping. We experiment with English and Russian pairs, translating the DEFT (Definition Extraction from Texts) corpus. This corpus contains three layers of annotation dedicated to term-definition pair mining, which is a rare annotation type for Russian. The presence of such a dataset is beneficial for the natural language processing methods of trend analysis in science since the terms and definitions are the basic blocks of any scientific field. We provide a pipeline for the annotation transfer using LLMs. In the end, we train the BERT-based models on the translated dataset to establish a baseline. Full article
21 pages, 4875 KB  
Article
Using Large Language Models for Goal-Oriented Dialogue Systems
by Leonid Legashev, Alexander Shukhman, Vadim Badikov and Vladislav Kurynov
Appl. Sci. 2025, 15(9), 4687; https://doi.org/10.3390/app15094687 - 23 Apr 2025
Viewed by 2366
Abstract
In the development of goal-oriented dialogue systems, neural network topic modeling and clustering methods are traditionally used to extract user intentions and operator response scenario blocks. The emergence of generative large language models allows one to radically change the approach to generating dialogue scenarios in the form of a graph with context preservation. In this article we analyzed seven popular large language models on prepared test prompts for Russian and English languages for intent mining and named entity recognition. The present study aimed to investigate the effectiveness of two methods for constructing dialogues in goal-oriented dialogue systems: the heuristic-based approach with additional training on labeled data and the prompt-based approach without such training. The primary objective was to evaluate the impact of incorporating labeled dialogue data on the quality of constructed dialogues, with a focus on dialogue context. The study emphasized the need for dialogue systems to consider the dialogue context in constructing goal-oriented dialogues. The two approaches were compared for the MultiWOZ 2.2 and MANTiS dialogue corpora on a locally deployed LLaMA model. The results showed that the LLaMA model without training on labeled dialogues achieved a BERTScore metric value of 0.75 for the MultiWOZ dataset and 0.72 for the MANTiS dataset, and the LLaMA model with training on labeled dialogues achieved a BERTScore metric value of 0.85 for the MultiWOZ dataset and 0.82 for the MANTiS dataset. This finding has practical implications for the development of more effective dialogue systems in the field of customer service that can engage users in more productive and meaningful machine-to-human interactions. Full article
17 pages, 1841 KB  
Article
Monitoring of Sustainable Development Trends: Text Mining in Regional Media
by Galina Chernyshova, Evgeniy Taran, Anna Firsova and Alla Vavilina
Sustainability 2025, 17(7), 3122; https://doi.org/10.3390/su17073122 - 1 Apr 2025
Cited by 1 | Viewed by 828
Abstract
The monitoring of regional development sustainability is closely linked to the development of an indicator system that best meets stakeholders’ requirements, providing a solid foundation for strategic decision-making. In pursuit of progress in achieving the Sustainable Development Goals (SDG), efforts are continuously being undertaken to refine and enhance the indicator framework. Implementing interdisciplinary approaches for a comprehensive assessment of sustainable development in regions allows for a swift expansion and augmentation of data on regional transformations. An important aspect of the study of sustainability at the regional level is the additional possibility of using unstructured news content through text mining methods. The issue of applying natural language processing techniques for Russian-language sources is significant, as a large number of relevant tools are developed for English. Additionally, the analysis of news content has several features that complicate the classification of sentiments of messages with mostly neutral wording. The proposed methodology for processing specific news content in assessing the sustainability of regional development was implemented. An application for data scraping was developed, data were collected taking into account the selected regions and periods, stop word dictionaries were configured, frequency analysis was implemented, and the sentiment analysis of the obtained slices was carried out. For the formed set of news documents related to sustainable development by keywords according to SDGs 1–17, for the regions of the Volga Federal District, a corpus of documents was obtained representing data for 2021, 2022, and 2023 for 14 regions. The analysis of key topics for different areas and periods was carried out using the cosine similarity measure. The developed approach to news analysis allows for increasing the efficiency of monitoring on various topics. 
This methodology has been tested for systemic and operational assessment in the dynamics of the sustainable development of regions. Text analysis methods within the framework of decision support at the regional level provide the opportunity to identify emerging trends. Full article
(This article belongs to the Section Development Goals towards Sustainability)
17 pages, 840 KB  
Article
Enhancing Green Practice Detection in Social Media with Paraphrasing-Based Data Augmentation
by Anna Glazkova and Olga Zakharova
Big Data Cogn. Comput. 2025, 9(4), 81; https://doi.org/10.3390/bdcc9040081 - 31 Mar 2025
Viewed by 471
Abstract
Detecting mentions of green waste practices on social networks is a crucial tool for environmental monitoring and sustainability analytics. Social media serve as a valuable source of ecological information, enabling researchers to track trends, assess public engagement, and predict the spread of sustainable behaviors. Automatic extraction of mentions of green waste practices facilitates large-scale analysis, but the uneven distribution of such mentions presents a challenge for effective detection. To address this, data augmentation plays a key role in balancing class distribution in green practice detection tasks. In this study, we compared existing data augmentation techniques based on the paraphrasing of original texts. We evaluated the effectiveness of additional explanations in prompts, the Chain-of-Thought prompting, synonym substitution, and text expansion. Experiments were conducted on the GreenRu dataset, which focuses on detecting mentions of green waste practices in Russian social media. Our results, obtained using two instruction-based large language models, demonstrated the effectiveness of the Chain-of-Thought prompting for text augmentation. These findings contribute to advancing sustainability analytics by improving automated detection and analysis of environmental discussions. Furthermore, the results of this study can be applied to other tasks that require augmentation of text data in the context of ecological research and beyond. Full article