Search Results (1,179)

Search Parameters:
Keywords = speech analysis

39 pages, 1188 KB  
Review
A Scoping Review of AI-Based Approaches for Detecting Autism Traits Using Voice and Behavioral Data
by Hajarimino Rakotomanana and Ghazal Rouhafzay
Bioengineering 2025, 12(11), 1136; https://doi.org/10.3390/bioengineering12111136 - 22 Oct 2025
Abstract
This scoping review systematically maps the rapidly evolving application of Artificial Intelligence (AI) in Autism Spectrum Disorder (ASD) diagnostics, specifically focusing on computational behavioral phenotyping. Recognizing that observable traits like speech and movement are critical for early, timely intervention, the study synthesizes AI’s use across eight key behavioral modalities. These include voice biomarkers, conversational dynamics, linguistic analysis, movement analysis, activity recognition, facial gestures, visual attention, and multimodal approaches. The review analyzed 158 studies published between 2015 and 2025, revealing that modern Machine Learning and Deep Learning techniques demonstrate highly promising diagnostic performance in controlled environments, with reported accuracies of up to 99%. Despite this significant capability, the review identifies critical challenges that impede clinical implementation and generalizability. These persistent limitations include pervasive issues with dataset heterogeneity, gender bias in samples, and small overall sample sizes. By detailing the current landscape of observable data types, computational methodologies, and available datasets, this work establishes a comprehensive overview of AI’s current strengths and fundamental weaknesses in ASD diagnosis. The article concludes by providing actionable recommendations aimed at guiding future research toward developing diagnostic solutions that are more inclusive, generalizable, and ultimately applicable in clinical settings. Full article

14 pages, 589 KB  
Article
The Diagnostic and Prognostic Value of Reticulated Platelets in Ischemic Stroke: Is Immature Platelet Fraction a New Biomarker?
by Fatih Cemal Tekin, Osman Lütfi Demirci, Emin Fatih Vişneci, Abdullah Enes Ataş, Hasan Hüseyin Kır, Hasan Basri Yıldırım, Çiğdem Damla Deniz, Demet Acar, Said Sami Erdem and Mehmet Gül
Medicina 2025, 61(10), 1887; https://doi.org/10.3390/medicina61101887 - 21 Oct 2025
Abstract
Background and Objectives: Ongoing efforts to develop early diagnostic tools for Acute Ischemic Stroke (AIS) point to the advantages of accessible biomarkers such as Immature Platelet Fraction (IPF). This is particularly important for emergency departments (EDs), especially those that are overcrowded and have limited resources. The present study aimed to evaluate the diagnostic, prognostic, and therapeutic significance of IPF in patients with AIS presenting to the ED. Materials and Methods: This prospective case–control study was conducted in an ED. Participants aged 18 years and older who presented with complaints of numbness, weakness, diplopia or visual disturbances, speech or comprehension impairment, confusion, imbalance, impaired coordination and gait, or dizziness were included in the study. The diagnostic value of IPF in AIS and its relationship with short-term prognosis (STP) were investigated. Additional variables potentially associated with parameters such as infarct localization, number of lesions, affected hemisphere, main artery status, carotid status, and treatment method were also analyzed. Results: The median age of the study participants was 67 years (Q1 = 54, Q3 = 76), with 48.9% (n = 88) being female and 51.1% (n = 92) male. Receiver operating characteristic curve analysis demonstrated that IPF was statistically significantly superior to other complete blood count parameters in the diagnostic evaluation of AIS. The diagnostic cutoff value of IPF for AIS was calculated as 2.45. An increase of 1 unit in IPF was found to raise the likelihood of AIS by 2.599 times. The ratios of Red Cell Distribution Width (RDW) to IPF and neutrophil count (NEU) to IPF, mean corpuscular volume, and infarct volume were found to be significant predictors in STP assessment. Conclusions: Although not definitive alone, IPF may aid early stroke recognition, support treatment monitoring, and inform targeted therapies. The use of IPF, a biomarker that can be rapidly obtained, in the diagnosis of AIS is expected to yield beneficial outcomes in patient management, particularly in emergency departments and other clinical settings. Full article
(This article belongs to the Special Issue New Insights into Cerebrovascular Disease)
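The abstract reports a ROC-derived diagnostic cutoff (2.45) and a per-unit odds multiplier (2.599) for IPF. A minimal sketch of how such figures are typically produced: the odds ratio comes from exponentiating a logistic-regression coefficient, and the cutoff is commonly chosen by maximizing Youden's J on the ROC curve. The IPF values below are invented for illustration, not the study's data.

```python
import math

# Odds ratio per 1-unit increase of a predictor is exp(beta) for a
# logistic-regression coefficient beta. The abstract's OR of 2.599
# corresponds to beta = ln(2.599) (illustrative back-calculation).
beta = math.log(2.599)

def youden_cutoff(scores_pos, scores_neg):
    """Pick the threshold maximizing sensitivity + specificity - 1
    (Youden's J), a common way ROC-based cutoffs are selected."""
    best_t, best_j = None, -1.0
    for t in sorted(set(scores_pos) | set(scores_neg)):
        sens = sum(s >= t for s in scores_pos) / len(scores_pos)
        spec = sum(s < t for s in scores_neg) / len(scores_neg)
        j = sens + spec - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# Hypothetical IPF values (%) for stroke vs. control participants.
ipf_stroke = [2.6, 3.1, 2.9, 4.0, 2.5, 3.4]
ipf_control = [1.8, 2.1, 2.4, 1.9, 2.3, 2.0]
cutoff, j = youden_cutoff(ipf_stroke, ipf_control)
```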

17 pages, 1055 KB  
Article
Testing a New Approach to Monitor Mild Cognitive Impairment and Cognition in Older Adults at the Community Level
by Isabel Paniak, Ethan Cohen, Christa Studzinski and Lia Tsotsos
Multimodal Technol. Interact. 2025, 9(10), 109; https://doi.org/10.3390/mti9100109 - 21 Oct 2025
Abstract
Dementia and mild cognitive impairment (MCI) are growing health concerns in Canada’s aging population. Over 700,000 Canadians currently live with dementia, and this number is expected to rise. As the older adult population increases, coupled with an already strained healthcare system, there is a pressing need for innovative tools that support aging in place. This study explored the feasibility and acceptability of using a Digital Human (DH) conversational agent, combined with AI-driven speech analysis, to monitor cognitive function, anxiety, and depression in cognitively healthy community-dwelling older adults (CDOA) aged 65 and older. Sixty older adults participated in up to three in-person sessions over six months, interacting with the DH through journaling and picture description tasks. Afterward, 51 of the participants completed structured interviews about their experiences and perceptions of the DH and AI more generally. Findings showed that 84% enjoyed interacting with the DH, and 96% expressed interest in learning more about AI in healthcare. While participants were open and curious about AI, 67% voiced concerns about AI replacing human interaction in healthcare. Most found the DH friendly, though reactions to its appearance varied. Overall, participants viewed AI as a promising tool, provided it complements, rather than replaces, human interactions. Full article

17 pages, 4937 KB  
Perspective
Unraveling Stuttering Through a Multi-Omics Lens
by Deyvid Novaes Marques
Life 2025, 15(10), 1630; https://doi.org/10.3390/life15101630 - 19 Oct 2025
Abstract
Stuttering, a complex and multifactorial speech disorder, has long presented an enigma regarding its etiology. While earlier approaches often emphasized psychosocial influences, historical clinical and speech-language strategies have considered multiple contributing factors. By integrating genomic, transcriptomic and phenomic evidence, the ongoing research illustrates how functional genomics can unravel the biological architecture of complex speech disorders. In particular, advances in omic technologies have unequivocally positioned genetics and underlying biological pathways at the forefront of stuttering research. I have experienced stuttering and lived with it since my early childhood. This perspective article presents findings from omic studies, highlighting relevant aspects such as gene discoveries, implicated cellular mechanisms, and the intricate genetic architecture of developmental stuttering. As a person who stutters, I offer an intimate perspective on how these scientific insights are not merely academic but profoundly impactful for the affected community. A multi-omic integration strategy, combining large-scale genetic discovery with deep phenotyping and functional validation, is advocated to accelerate understanding in this field. Additionally, a bibliometric analysis using an international database was conducted to map trends and identify directions in stuttering research within the omic context. Ultimately, these scientific endeavors hold the potential to inform not only personalized interventions but also critical policy and regulatory changes, enhancing accessibility, support, and the recognized rights of people who stutter. Full article
(This article belongs to the Special Issue Recent Advances in Functional Genomics)

23 pages, 5774 KB  
Article
A Multimodal Voice Phishing Detection System Integrating Text and Audio Analysis
by Jiwon Kim, Seuli Gu, Youngbeom Kim, Sukwon Lee and Changgu Kang
Appl. Sci. 2025, 15(20), 11170; https://doi.org/10.3390/app152011170 - 18 Oct 2025
Abstract
Voice phishing has emerged as a critical security threat, exploiting both linguistic manipulation and advances in synthetic speech technologies. Traditional keyword-based approaches often fail to capture contextual patterns or detect forged audio, limiting their effectiveness in real-world scenarios. To address this gap, we propose a multimodal voice phishing detection system that integrates text and audio analysis. The text module employs a KoBERT-based transformer classifier with self-attention interpretation, while the audio module leverages MFCC features and a CNN–BiLSTM classifier to identify synthetic speech. A fusion mechanism combines the outputs of both modalities, with experiments conducted on real-world call transcripts, phishing datasets, and synthetic voice corpora. The results demonstrate that the proposed system consistently achieves high values regarding the accuracy, precision, recall, and F1-score on validation data while maintaining robust performance in noisy and diverse real-call scenarios. Furthermore, attention-based interpretability enhances trustworthiness by revealing cross-token and discourse-level interaction patterns specific to phishing contexts. These findings highlight the potential of the proposed system as a reliable, explainable, and deployable solution for preventing the financial and social damage caused by voice phishing. Unlike prior studies limited to single-modality or shallow fusion, our work presents a fully integrated text–audio detection pipeline optimized for Korean real-world datasets and robust to noisy, multi-speaker conditions. Full article
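The system above combines the outputs of a text classifier (KoBERT) and an audio classifier (CNN–BiLSTM) through a fusion mechanism. The listing does not specify the fusion rule, so the following is only a minimal late-fusion sketch with hypothetical weights and threshold, showing the general idea of blending per-modality phishing probabilities.

```python
def fuse_scores(p_text, p_audio, w_text=0.6, w_audio=0.4, threshold=0.5):
    """Weighted late fusion of per-modality phishing probabilities.
    Weights and threshold are illustrative, not from the paper."""
    p = w_text * p_text + w_audio * p_audio
    return p, p >= threshold

# A call whose transcript looks suspicious but whose audio seems genuine:
p, flagged = fuse_scores(p_text=0.92, p_audio=0.30)
```

In this toy setup the high text score dominates and the call is still flagged, which mirrors the motivation for weighting the linguistic channel when synthetic-speech cues are weak.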

29 pages, 747 KB  
Systematic Review
Hate Speech on Social Media: A Systemic Narrative Review of Political Science Contributions
by Cigdem Kentmen-Cin
Soc. Sci. 2025, 14(10), 610; https://doi.org/10.3390/socsci14100610 - 15 Oct 2025
Abstract
Cross-national public opinion surveys show that a significant majority of young people are frequently exposed to hateful content on social media, which suggests the need to better understand its political implications. This systematic narrative literature review addresses three key questions: (1) Which factors have been explored in political science as the main drivers of hate speech on social media? (2) What do empirical studies in political science suggest about the political consequences of online hate speech? (3) What strategies have been proposed within the political science literature to address and counteract these dynamics? Based on an analysis of 79 research articles published in the field of political science and international relations retrieved from the Web of Science Core Collection, this review found that online hate is linked to social media platform policies, national and international regulatory frameworks, perceived threats to in-group identity, far-right and populist rhetoric, politically significant events such as elections, the narratives of traditional media, the post-truth environment, and historical animosities. The literature shows that hate speech normalizes discriminatory behavior, silences opposing voices, and mobilizes organized hate. In response, political science research underscores the importance of online deterrence mechanisms, counter-speech, allyship, and digital literacy as strategies to combat hate during the social media era. Full article
(This article belongs to the Section International Politics and Relations)

58 pages, 744 KB  
Article
Review and Comparative Analysis of Databases for Speech Emotion Recognition
by Salvatore Serrano, Omar Serghini, Giulia Esposito, Silvia Carbone, Carmela Mento, Alessandro Floris, Simone Porcu and Luigi Atzori
Data 2025, 10(10), 164; https://doi.org/10.3390/data10100164 - 14 Oct 2025
Abstract
Speech emotion recognition (SER) has become increasingly important in areas such as healthcare, customer service, robotics, and human–computer interaction. The progress of this field depends not only on advances in algorithms but also on the databases that provide the training material for SER systems. These resources set the boundaries for how well models can generalize across speakers, contexts, and cultures. In this paper, we present a narrative review and comparative analysis of emotional speech corpora released up to mid-2025, bringing together both psychological and technical perspectives. Rather than following a systematic review protocol, our approach focuses on providing a critical synthesis of more than fifty corpora covering acted, elicited, and natural speech. We examine how these databases were collected, how emotions were annotated, their demographic diversity, and their ecological validity, while also acknowledging the limits of available documentation. Beyond description, we identify recurring strengths and weaknesses, highlight emerging gaps, and discuss recent usage patterns to offer researchers both a practical guide for dataset selection and a critical perspective on how corpus design continues to shape the development of robust and generalizable SER systems. Full article

29 pages, 1708 KB  
Article
Speech Recognition and Synthesis Models and Platforms for the Kazakh Language
by Aidana Karibayeva, Vladislav Karyukin, Balzhan Abduali and Dina Amirova
Information 2025, 16(10), 879; https://doi.org/10.3390/info16100879 - 10 Oct 2025
Abstract
With the rapid development of artificial intelligence and machine learning technologies, automatic speech recognition (ASR) and text-to-speech (TTS) have become key components of the digital transformation of society. The Kazakh language, as a representative of the Turkic language family, remains a low-resource language with limited audio corpora, language models, and high-quality speech synthesis systems. This study provides a comprehensive analysis of existing speech recognition and synthesis models, emphasizing their applicability and adaptation to the Kazakh language. Special attention is given to linguistic and technical barriers, including the agglutinative structure, rich vowel system, and phonemic variability. Both open-source and commercial solutions were evaluated, including Whisper, GPT-4 Transcribe, ElevenLabs, OpenAI TTS, Voiser, KazakhTTS2, and TurkicTTS. Speech recognition systems were assessed using BLEU, WER, TER, chrF, and COMET, while speech synthesis was evaluated with MCD, PESQ, STOI, and DNSMOS, thus covering both lexical–semantic and acoustic–perceptual characteristics. The results demonstrate that, for speech-to-text (STT), the strongest performance was achieved by Soyle on domain-specific data (BLEU 74.93, WER 18.61), while Voiser showed balanced accuracy (WER 40.65–37.11, chrF 80.88–84.51) and GPT-4 Transcribe achieved robust semantic preservation (COMET up to 1.02). In contrast, Whisper performed weakest (WER 77.10, BLEU 13.22), requiring further adaptation for Kazakh. For text-to-speech (TTS), KazakhTTS2 delivered the most natural perceptual quality (DNSMOS 8.79–8.96), while OpenAI TTS achieved the best spectral accuracy (MCD 123.44–117.11, PESQ 1.14). TurkicTTS offered reliable intelligibility (STOI 0.15, PESQ 1.16), and ElevenLabs produced natural but less spectrally accurate speech. Full article
(This article belongs to the Section Artificial Intelligence)
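WER, the headline ASR metric in the abstract above, is the word-level Levenshtein edit distance between hypothesis and reference, divided by the number of reference words. A self-contained sketch of the standard computation:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference
    word count, computed via word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# One substitution in a four-word reference gives WER 0.25.
score = wer("the cat sat down", "the dog sat down")
```

Note that WER can exceed 1.0 when the hypothesis inserts many extra words, which is why very high figures such as Whisper's 77.10 reported above remain meaningful.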

16 pages, 238 KB  
Article
Anti-Bullying in the Digital Age: How Cyberhate Travels from Social Media to Classroom Climate in Pre-Service Teacher Programmes
by Jesús Marolla-Gajardo and María Yazmina Lozano Mas
Societies 2025, 15(10), 284; https://doi.org/10.3390/soc15100284 - 10 Oct 2025
Abstract
This article examines online hate as a driver of cyberbullying and a barrier to inclusive schooling, integrating theoretical, philosophical and methodological perspectives. We approach hate speech as communicative practices that legitimise discrimination and exclusion and, once amplified by social media affordances, erode equity, belonging and well-being in educational settings. The study adopts a qualitative, exploratory–descriptive design using focus groups with pre-service teachers from initial teacher education programmes across several Chilean regions. Participants reflected on the presence, trajectories and classroom effects of cyberhate/cyberbullying. Data were analysed thematically with ATLAS.ti24. Findings describe a recurrent pathway in which anonymous posts lead to public exposure, followed by heightened anxiety and eventual withdrawal. This shows how online aggression spills into classrooms, normalises everyday disparagement and fuels self-censorship, especially among minoritised students. The analysis also highlights the amplifying role of educator authority (tone, feedback, modelling) and institutional inaction. In response, participants identified protective practices: explicit dialogic norms, rapid and caring classroom interventions, restorative and care-centred feedback, partnership with families and peers, and critical digital citizenship that links platform literacy with ethical reasoning. The article contributes evidence to inform anti-bullying policy, inclusive curriculums and teacher education by proposing actionable, context-sensitive strategies that strengthen equity, dignity and belonging. Full article
(This article belongs to the Special Issue Anti-Bullying in the Digital Age: Evidences and Emerging Trends)
19 pages, 1648 KB  
Article
Modality-Enhanced Multimodal Integrated Fusion Attention Model for Sentiment Analysis
by Zhenwei Zhang, Wenyan Wu, Tao Yuan and Guang Feng
Appl. Sci. 2025, 15(19), 10825; https://doi.org/10.3390/app151910825 - 9 Oct 2025
Abstract
Multimodal sentiment analysis aims to utilize multisource information such as text, speech and vision to more comprehensively and accurately identify an individual’s emotional state. However, existing methods still face challenges in practical applications, including modality heterogeneity, insufficient expressive power of non-verbal modalities, and low fusion efficiency. To address these issues, this paper proposes a Modality Enhanced Multimodal Integration Model (MEMMI). First, a modality enhancement module is designed to leverage the semantic guidance capability of the text modality, enhancing the feature representation of non-verbal modalities through a multihead attention mechanism and a dynamic routing strategy. Second, a gated fusion mechanism is introduced to selectively inject speech and visual information into the dominant text modality, enabling robust information completion and noise suppression. Finally, a combined attention fusion module is constructed to synchronously fuse information from all three modalities within a unified architecture, while a multiscale encoder is used to capture feature representations at different semantic levels. Experimental results on three benchmark datasets—CMU-MOSEI, CMU-MOSI, and CH-SIMS—demonstrate the superiority of the proposed model. On CMU-MOSI, it achieves an Acc-7 of 45.91, with binary accuracy/F1 of 82.86/84.60, MAE of 0.734, and Corr of 0.790, outperforming TFN and MulT by a large margin. On CMU-MOSEI, the model reaches an Acc-7 of 54.17, Acc-2/F1 of 83.69/86.02, MAE of 0.526, and Corr of 0.779, surpassing all baselines, including ALMT. On CH-SIMS, it further achieves 41.88, 66.52, and 77.68 in Acc-5/Acc-3/Acc-2, with F1 of 77.85, MAE of 0.450, and Corr of 0.594, establishing new state-of-the-art performance across datasets. Furthermore, ablation studies validate the effectiveness of each module in enhancing modality representation and fusion efficiency. Full article
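The gated injection of non-verbal features into the text stream described in this abstract can be sketched generically. The form below (a sigmoid gate over the concatenated features, out = text + g * nonverbal) is a common pattern assumed here for illustration; MEMMI's exact parameterization is not given in the listing, and the weights are random stand-ins for trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # feature dimension (illustrative)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_inject(text, nonverbal, W, b):
    """Inject a non-verbal feature vector into the text representation
    through a sigmoid gate: out = text + g * nonverbal, where
    g = sigmoid(W @ [text; nonverbal] + b). Because each gate value
    lies in (0, 1), the injection is a soft, element-wise selection."""
    g = sigmoid(W @ np.concatenate([text, nonverbal]) + b)
    return text + g * nonverbal

text = rng.normal(size=d)      # stand-in text features
audio = rng.normal(size=d)     # stand-in speech features
W = rng.normal(size=(d, 2 * d)) * 0.1
b = np.zeros(d)
fused = gated_inject(text, audio, W, b)
```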

23 pages, 1934 KB  
Article
INTU-AI: Digitalization of Police Interrogation Supported by Artificial Intelligence
by José Pinto Garcia, Carlos Grilo, Patrício Domingues and Rolando Miragaia
Appl. Sci. 2025, 15(19), 10781; https://doi.org/10.3390/app151910781 - 7 Oct 2025
Abstract
Traditional police interrogation processes remain largely time-consuming and reliant on substantial human effort for both analysis and documentation. Intuition Artificial Intelligence (INTU-AI) is a Windows application designed to digitalize the administrative workflow associated with police interrogations, while enhancing procedural efficiency through the integration of AI-driven emotion recognition models. The system employs a multimodal approach that captures and analyzes emotional states using three primary vectors: Facial Expression Recognition (FER), Speech Emotion Recognition (SER), and Text-based Emotion Analysis (TEA). This triangulated methodology aims to identify emotional inconsistencies and detect potential suppression or concealment of affective responses by interviewees. INTU-AI serves as a decision-support tool rather than a replacement for human judgment. By automating bureaucratic tasks, it allows investigators to focus on critical aspects of the interrogation process. The system was validated in practical training sessions with inspectors and with a 12-question questionnaire. The results indicate a strong acceptance of the system in terms of its usability, existing functionalities, practical utility of the program, user experience, and open-ended qualitative responses. Full article
(This article belongs to the Special Issue Digital Transformation in Information Systems)
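INTU-AI's triangulated methodology looks for emotional inconsistencies across the facial (FER), speech (SER), and text (TEA) channels. As a toy illustration only, assuming a simple majority-agreement rule that the listing does not actually specify, a segment could be flagged for review when no two modalities agree:

```python
def flag_inconsistency(fer: str, ser: str, tea: str) -> bool:
    """Flag a segment when the three modality labels (facial, speech,
    text) fail to agree on a majority emotion. A hypothetical stand-in
    for the triangulation idea, not INTU-AI's actual decision logic."""
    labels = [fer, ser, tea]
    return max(labels.count(lab) for lab in set(labels)) < 2

# Face reads neutral while voice and wording diverge: worth review.
needs_review = flag_inconsistency("neutral", "angry", "fear")
```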

19 pages, 7222 KB  
Article
Multi-Channel Spectro-Temporal Representations for Speech-Based Parkinson’s Disease Detection
by Hadi Sedigh Malekroodi, Nuwan Madusanka, Byeong-il Lee and Myunggi Yi
J. Imaging 2025, 11(10), 341; https://doi.org/10.3390/jimaging11100341 - 1 Oct 2025
Abstract
Early, non-invasive detection of Parkinson’s Disease (PD) using speech analysis offers promise for scalable screening. In this work, we propose a multi-channel spectro-temporal deep-learning approach for PD detection from sentence-level speech, a clinically relevant yet underexplored modality. We extract and fuse three complementary time–frequency representations—mel spectrogram, constant-Q transform (CQT), and gammatone spectrogram—into a three-channel input analogous to an RGB image. This fused representation is evaluated across CNNs (ResNet, DenseNet, and EfficientNet) and Vision Transformer using the PC-GITA dataset, under 10-fold subject-independent cross-validation for robust assessment. Results showed that fusion consistently improves performance over single representations across architectures. EfficientNet-B2 achieves the highest accuracy (84.39% ± 5.19%) and F1-score (84.35% ± 5.52%), outperforming recent methods using handcrafted features or pretrained models (e.g., Wav2Vec2.0, HuBERT) on the same task and dataset. Performance varies with sentence type, with emotionally salient and prosodically emphasized utterances yielding higher AUC, suggesting that richer prosody enhances discriminability. Our findings indicate that multi-channel fusion enhances sensitivity to subtle speech impairments in PD by integrating complementary spectral information. Our approach implies that multi-channel fusion could enhance the detection of discriminative acoustic biomarkers, potentially offering a more robust and effective framework for speech-based PD screening, though further validation is needed before clinical application. Full article
(This article belongs to the Special Issue Celebrating the 10th Anniversary of the Journal of Imaging)
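The fusion step above stacks three time-frequency representations of the same utterance into an RGB-like tensor. A minimal sketch of that stacking with per-channel min-max normalization, using random placeholder arrays (a real pipeline would compute mel, CQT, and gammatone spectrograms from audio, e.g. with an audio library, which is not shown here):

```python
import numpy as np

rng = np.random.default_rng(1)
n_bins, n_frames = 128, 64  # illustrative time-frequency grid

# Placeholders; in practice each comes from a different transform
# (mel, CQT, gammatone) of the same utterance.
mel = rng.random((n_bins, n_frames))
cqt = rng.random((n_bins, n_frames))
gamma = rng.random((n_bins, n_frames))

def to_three_channel(*specs):
    """Min-max normalize each spectrogram and stack into (C, H, W),
    the image-like input that CNN/ViT backbones expect."""
    chans = []
    for s in specs:
        s = (s - s.min()) / (s.max() - s.min() + 1e-8)
        chans.append(s)
    return np.stack(chans, axis=0)

x = to_three_channel(mel, cqt, gamma)
```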

20 pages, 726 KB  
Article
Suržyk as a Transitional Stage from Russian to Ukrainian: The Perspective of Ukrainian Migrants and War Refugees in Finland
by Yan Kapranov, Anna Verschik, Liisa-Maria Lehto and Maria Frick
Languages 2025, 10(10), 254; https://doi.org/10.3390/languages10100254 - 30 Sep 2025
Abstract
This article examines how Ukrainian migrants and war refugees in Finland perceive and use Suržyk, a cluster of intermediate varieties between Ukrainian and Russian, as a transitional stage facilitating the shift from Russian-dominant to Ukrainian-dominant speech. Drawing on 1615 survey responses collected between November 2022 and January 2023, the study reveals that 42 respondents view Suržyk as a bridge that supports the gradual acquisition of standard Ukrainian. Qualitative content analysis of open-ended responses shows repeated references to Suržyk as a “stepping stone”, “temporary means” or “bridge”, highlighting its role in maintaining intelligibility and fluency for speakers who are not confident in standard Ukrainian. Although some respondents acknowledge the stigma associated with mixed speech, they also stress Suržyk’s practical advantages in contexts shaped by the 2022 full-scale war and heightened purist discourses. Speakers report pressure to adhere to purist language norms in formal settings, whereas in informal spaces, they consider Suržyk a natural outcome of bilingual backgrounds. These findings illuminate the interplay between language ideologies, sociopolitical dynamics, and individual agency, suggesting that for many Ukrainians in Finland, Suržyk serves as a temporary yet functional means to align with Ukrainian identity under rapidly changing circumstances. Full article
(This article belongs to the Special Issue Language Attitudes and Language Ideologies in Eastern Europe)
29 pages, 2068 KB  
Article
Voice-Based Early Diagnosis of Parkinson’s Disease Using Spectrogram Features and AI Models
by Danish Quamar, V. D. Ambeth Kumar, Muhammad Rizwan, Ovidiu Bagdasar and Manuella Kadar
Bioengineering 2025, 12(10), 1052; https://doi.org/10.3390/bioengineering12101052 - 29 Sep 2025
Abstract
Parkinson’s disease (PD) is a progressive neurodegenerative disorder that significantly affects motor functions, including speech production. Voice analysis offers a less invasive, faster and more cost-effective approach for diagnosing and monitoring PD over time. This research introduces an automated system to distinguish between PD and non-PD individuals based on speech signals using state-of-the-art signal processing and machine learning (ML) methods. A publicly available voice dataset (Dataset 1, 81 samples) containing speech recordings from PD patients and non-PD individuals was used for model training and evaluation. Additionally, a small supplementary dataset (Dataset 2, 15 samples) was created, although it was excluded from the experiments, to illustrate potential future extensions of this work. Features such as Mel-frequency cepstral coefficients (MFCCs), spectrograms, Mel spectrograms and waveform representations were extracted to capture key vocal impairments related to PD, including diminished vocal range, weak harmonics, elevated spectral entropy and impaired formant structures. These extracted features were used to train and evaluate several ML models, including support vector machine (SVM), XGBoost and logistic regression, as well as deep learning (DL) architectures such as deep neural networks (DNN), convolutional neural networks (CNN) combined with long short-term memory (LSTM), CNN + gated recurrent unit (GRU) and bidirectional LSTM (BiLSTM). Experimental results show that DL models, particularly BiLSTM, outperform traditional ML models, achieving 97% accuracy and an AUC of 0.95. The comprehensive feature extraction from both datasets enabled robust classification of PD and non-PD speech signals. These findings highlight the potential of integrating acoustic features with DL methods for early diagnosis and monitoring of Parkinson’s disease. Full article
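The spectrogram features the abstract names can be illustrated with a minimal sketch: a magnitude spectrogram computed from Hann-windowed short-time FFT frames. This is an illustrative, numpy-only implementation, not the study's actual pipeline; the frame length, hop size, and the synthetic 440 Hz test tone are assumptions chosen for the example.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via short-time FFT over Hann-windowed frames."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft keeps the non-negative frequency bins (frame_len // 2 + 1 of them)
    return np.abs(np.fft.rfft(frames, axis=1)).T  # shape: (freq_bins, n_frames)

# Illustrative input: a 1-second 440 Hz tone sampled at 8 kHz
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
S = spectrogram(tone)

# The frequency bin with the most average energy should sit near 440 Hz
peak_bin = S.mean(axis=1).argmax()
peak_hz = peak_bin * sr / 256  # bin spacing = sr / frame_len
```

In a classification pipeline such as the one described, a matrix like `S` (or MFCCs derived from a Mel-filtered version of it) would be fed to the ML or DL model as the per-recording feature representation.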
14 pages, 378 KB  
Article
Exploring Language Recovery Pattern in Persons with Aphasia Across Acute and Sub-Acute Stages
by Deepak Puttanna, Nova Maria Saji, Mohammed F. ALHarbi, Akshaya Swamy and Darshan Hosaholalu Sarvajna
Behav. Sci. 2025, 15(10), 1339; https://doi.org/10.3390/bs15101339 - 29 Sep 2025
Abstract
Recovery from aphasia is a complex process involving restoring language ability to a level comparable to an individual’s pre-aphasia state. This recovery extends beyond linguistic functions to encompass improved quality of life and functional communication. Understanding language recovery in persons with aphasia (PWAs) is a key area in aphasia research. Thus, the current study aimed to understand the pattern of language recovery in PWAs during the acute and sub-acute stages. A total of 11 PWAs aged between 40 and 80 were recruited. The study was conducted in two phases. In the acute stage (within one week post-stroke), participants were assessed using the Western Aphasia Battery-Kannada (WAB-K). In the sub-acute stage (between seven and fifteen days post-stroke), the same test battery was repeated. The findings showed that auditory verbal comprehension scores were the most pronounced in both the acute and sub-acute stages of recovery. Further, language quotient (LQ) scores were higher in the sub-acute stage than in the acute stage, though these differences did not reach statistical significance. Correlation analysis revealed strong positive correlations between LQ and spontaneous speech, repetition, and naming, with moderate correlations for auditory verbal comprehension. The study’s findings highlight the importance of targeted therapeutic interventions for PWAs, emphasizing an early focus on auditory verbal comprehension to enhance overall language recovery. Full article
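The correlation analysis the abstract reports can be sketched with the Pearson correlation coefficient. The scores below are hypothetical, made-up values for illustration only (not the study's data); the point is simply how a strong positive LQ-to-subtest correlation is computed.

```python
import numpy as np

# Hypothetical participant scores (illustrative only): language quotient (LQ)
# and spontaneous-speech scores for six imagined participants
lq     = np.array([52.0, 61.5, 70.2, 74.8, 80.1, 88.3])
speech = np.array([10.0, 12.0, 15.0, 16.5, 18.0, 19.5])

# Pearson correlation coefficient between the two score series;
# np.corrcoef returns the 2x2 correlation matrix, so take the off-diagonal
r = np.corrcoef(lq, speech)[0, 1]
```

A value of `r` near 1 would correspond to the "strong positive correlation" the study reports between LQ and spontaneous speech; moderate correlations sit further from 1.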
(This article belongs to the Section Experimental and Clinical Neurosciences)