Search Results (1,280)

Search Parameters:
Keywords = language statistics

20 pages, 1328 KB  
Article
From Divergence to Alignment: Evaluating the Role of Large Language Models in Facilitating Agreement Through Adaptive Strategies
by Loukas Triantafyllopoulos and Dimitris Kalles
Future Internet 2025, 17(9), 407; https://doi.org/10.3390/fi17090407 (registering DOI) - 6 Sep 2025
Abstract
Achieving consensus in group decision-making often involves overcoming significant challenges, particularly reconciling diverse perspectives and mitigating biases that hinder agreement. Traditional methods relying on human facilitators are usually constrained by scalability and efficiency, especially in large-scale, fast-paced discussions. To address these challenges, this study proposes a novel real-time facilitation framework, employing large language models (LLMs) as automated facilitators within a custom-built multi-user chat system. This framework is distinguished by its real-time adaptive system architecture, which enables dynamic adjustments to facilitation strategies based on ongoing discussion dynamics. Leveraging cosine similarity as a core metric, this approach evaluates the ability of three state-of-the-art LLMs (ChatGPT 4.0, Mistral Large 2, and AI21 Jamba-Instruct) to synthesize consensus proposals that align with participants’ viewpoints. Unlike conventional techniques, the system integrates adaptive facilitation strategies, including clarifying misunderstandings, summarizing discussions, and proposing compromises, enabling the LLMs to iteratively refine consensus proposals based on user feedback. Experimental results indicate that ChatGPT 4.0 achieved the highest alignment with participant opinions and required fewer iterations to reach consensus. A one-way ANOVA confirmed that differences in performance between models were statistically significant. Moreover, descriptive analyses revealed nuanced differences in model behavior across various sustainability-focused discussion topics, including climate action, quality education, good health and well-being, and access to clean water and sanitation. These findings highlight the promise of LLM-driven facilitation for improving collective decision-making processes and underscore the need for further research into robust evaluation metrics, ethical considerations, and cross-cultural adaptability.
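The abstract above names cosine similarity as its core alignment metric. The following is a minimal sketch of how such a score could be computed between a consensus proposal and a set of participant opinions. It assumes plain bag-of-words count vectors as a stand-in for whatever text representation the authors actually used; the function names `bag_of_words`, `cosine_similarity`, and `mean_alignment` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase whitespace tokenization into term counts (assumed stand-in for real embeddings)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors; 0.0 if either vector is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_alignment(proposal, opinions):
    """Average cosine similarity between one proposal and each participant opinion."""
    p = bag_of_words(proposal)
    return sum(cosine_similarity(p, bag_of_words(o)) for o in opinions) / len(opinions)


if __name__ == "__main__":
    opinions = ["invest in clean water access", "fund clean water and sanitation"]
    print(mean_alignment("prioritize clean water and sanitation funding", opinions))
```

A higher mean score would indicate a proposal whose wording overlaps more with the participants' stated views; a sentence-embedding model could be swapped in for `bag_of_words` without changing the rest of the sketch.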

18 pages, 8435 KB  
Article
Modeling Sentiment–Hydrology Interaction Using LLM: Insights for Adaptive Governance in Ceará’s Water Management
by Tatiane Lima Batista, Ticiana Marinho de Carvalho Studart, Marlon Gonçalves Duarte and Francisco de Assis de Souza Filho
Water 2025, 17(17), 2615; https://doi.org/10.3390/w17172615 - 4 Sep 2025
Viewed by 147
Abstract
This study analyzes the relationships between stakeholders’ concerns and sentiments and the drought stage in a semi-arid region of Ceará, using language technologies based on artificial intelligence. The dataset comprises 36 meeting minutes of water management bodies (2007–2024), of which 17 correspond to dry periods and 19 to normal periods (reservoir volume > 50%). Natural Language Processing (NLP) techniques were applied to generate word clouds, and sentiment analysis was performed using a Large Language Model (Llama 3.2, 3B). Sentiment scores were compared with reservoir volume data. Results show that both perceptions and themes differed between drought and normal phases, with higher water availability coinciding with more positive sentiments. A moderate positive correlation was found between sentiment and reservoir volume (r = 0.53, p = 0.00095, 95% CI [0.24, 0.73]). Statistical tests confirmed differences between periods (Welch’s t-test, p = 0.0018; Mann-Whitney, p = 0.0039). Box-plot analyses indicated that over 75% of sentiments were positive in normal phases, while about 65% were negative in drought phases. These findings highlight the sensitivity of human perceptions to hydrological conditions and point to the potential of LLMs as innovative instruments for integrating qualitative data into complex socio-environmental analyses.
(This article belongs to the Special Issue Application of Hydrological Modelling to Water Resources Management)
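The confidence interval quoted in the abstract above can be reproduced with the standard Fisher z-transformation. The sketch below assumes the correlation was computed over all 36 meeting minutes (n = 36), which the abstract does not state explicitly; `pearson_ci` is a hypothetical helper name, not the authors' code.

```python
import math
from statistics import NormalDist


def pearson_ci(r, n, level=0.95):
    """Confidence interval for a Pearson correlation via the Fisher z-transformation."""
    z = math.atanh(r)                             # Fisher z of the sample correlation
    se = 1.0 / math.sqrt(n - 3)                   # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal critical value
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Assumption: n = 36, one sentiment score per meeting minute.
low, high = pearson_ci(0.53, 36)
print(round(low, 2), round(high, 2))  # prints: 0.24 0.73
```

Under that assumption the result matches the reported 95% CI of [0.24, 0.73], which suggests the interval was obtained with this or an equivalent method.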

23 pages, 4541 KB  
Article
A Simulation-Based Risk Assessment Model for Comparative Analysis of Collisions in Autonomous and Non-Autonomous Haulage Trucks
by Malihe Goli, Amin Moniri-Morad, Mario Aguilar, Masoud S. Shishvan, Mahdi Shahsavar and Javad Sattarvand
Appl. Sci. 2025, 15(17), 9702; https://doi.org/10.3390/app15179702 - 3 Sep 2025
Viewed by 133
Abstract
The implementation of autonomous haulage trucks in open-pit mines represents a progressive advancement in the mining industry, but it poses potential safety risks that require thorough assessment. This study proposes an integrated model that combines discrete-event simulation (DES) with a risk matrix to assess collisions in three operational scenarios: non-autonomous, hybrid, and fully autonomous truck operations. To achieve these objectives, a comprehensive dataset was collected and analyzed using statistical models and natural language processing (NLP) techniques. Multiple scenarios were then developed and simulated to compare the risks of collision and evaluate the impact of eliminating human intervention in hauling operations. A risk matrix was designed to assess the likelihood and severity of collisions in each scenario, emphasizing the impact on both human safety and project operations. The results revealed an inverse relationship between the number of autonomous trucks and the frequency of collisions, underscoring the potential safety advantages of fully autonomous operations. The collision probabilities show an improvement of approximately 91.7% and 90.7% in the third scenario compared to the first and second scenarios, respectively. Furthermore, high-risk areas were identified at intersections with high traffic. These findings offer valuable insights into enhancing safety protocols and integrating advanced monitoring technologies in open-pit mining operations, particularly those utilizing autonomous haulage truck fleets.

26 pages, 740 KB  
Article
Enhancement of the Generation Quality of Generative Linguistic Steganographic Texts by a Character-Based Diffusion Embedding Algorithm (CDEA)
by Yingquan Chen, Qianmu Li, Aniruddha Bhattacharjya, Xiaocong Wu, Huifeng Li, Qing Chang, Le Zhu and Yan Xiao
Appl. Sci. 2025, 15(17), 9663; https://doi.org/10.3390/app15179663 - 2 Sep 2025
Viewed by 158
Abstract
Generative linguistic steganography aims to produce texts that remain both perceptually and statistically imperceptible. Existing embedding algorithms often suffer from imbalanced candidate selection, where high-probability words are overlooked and low-probability words dominate, leading to reduced coherence and fluency. We introduce a character-based diffusion embedding algorithm (CDEA) that uniquely leverages character-level statistics and a power-law-inspired grouping strategy to better balance candidate word selection. Unlike prior methods, the proposed CDEA explicitly prioritizes high-probability candidates, thereby improving both semantic consistency and text naturalness. When combined with XLNet, it effectively generates longer sensitive sequences while preserving quality. The experimental results showed that CDEA not only produces steganographic texts with higher imperceptibility and fluency but also achieves stronger resistance to steganalysis compared with existing approaches. Future work will enhance statistical imperceptibility, integrate CDEA with larger language models such as GPT-5, and extend applications to cross-lingual, multimodal, and practical IoT or blockchain communication scenarios.
(This article belongs to the Special Issue Cyber Security and Software Engineering)
59 pages, 3596 KB  
Review
Beginner-Friendly Review of Research on R-Based Energy Forecasting: Insights from Text Mining
by Minjoong Kim, Hyeonwoo Kim and Jihoon Moon
Electronics 2025, 14(17), 3513; https://doi.org/10.3390/electronics14173513 - 2 Sep 2025
Viewed by 172
Abstract
Data-driven forecasting is becoming increasingly central to modern energy management, yet nonspecialists without a background in artificial intelligence (AI) face significant barriers to entry. While Python is the dominant machine learning language, R remains a practical and accessible tool for users with expertise in statistics, engineering, or domain-specific analysis. To inform tool selection, we first provide an evidence-based comparison of R with major alternatives before reviewing 49 peer-reviewed articles published between 2020 and 2025 in Science Citation Index Expanded (SCIE)-level journals that utilized R for energy forecasting tasks, including electricity (regional and site-level), solar, wind, thermal energy, and natural gas. Despite such growth, the field still lacks a systematic, cross-domain synthesis that clarifies which R-based methods prevail, how accessible workflows are implemented, and where methodological gaps remain; this motivated our use of text mining. Text mining techniques were employed to categorize the literature according to forecasting objectives, modeling methods, application domains, and tool usage patterns. The results indicate that tree-based ensemble learning models—e.g., random forests, gradient boosting, and hybrid variants—are employed most frequently, particularly for solar and short-term load forecasting. Notably, few studies incorporated automated model selection or explainable AI; however, there is a growing shift toward interpretable and beginner-friendly workflows. This review offers a practical reference for nonexperts seeking to apply R in energy forecasting contexts, emphasizing accessible modeling strategies and reproducible practices. We also curate example R scripts, workflow templates, and a study-level link catalog to support replication. The findings of this review support the broader democratization of energy analytics by identifying trends and methodologies suitable for users without advanced AI training. 
Finally, we synthesize domain-specific evidence and outline the text-mining pipeline, present visual keyword profiles and comparative performance tables that surface prevailing strategies and unmet needs, and conclude with practical guidance and targeted directions for future research.

20 pages, 564 KB  
Review
Neurodevelopmental Outcomes in Children Born to Mothers Infected with SARS-CoV-2 During Pregnancy: A Narrative Review
by Daniela Păcurar, Alexandru Dinulescu, Ana Prejmereanu, Alexandru Cosmin Palcău, Irina Dijmărescu and Mirela-Luminița Pavelescu
J. Clin. Med. 2025, 14(17), 6202; https://doi.org/10.3390/jcm14176202 - 2 Sep 2025
Viewed by 181
Abstract
Background: The potential impact of maternal SARS-CoV-2 infection during pregnancy on the neurodevelopment of offspring has raised considerable concern. Emerging studies have evaluated various developmental domains in exposed infants, yet findings remain inconsistent. Objective: To synthesize current evidence regarding neurodevelopmental outcomes in infants born to mothers with confirmed SARS-CoV-2 infection during pregnancy. Methods: We conducted a narrative review following PRISMA guidelines. A literature search was performed in PubMed, Cochrane, and ScienceDirect using keywords including “COVID-19”, “pregnancy”, “neurodevelopment”, and “SARS-CoV-2”. Nineteen studies were included. Data were extracted regarding study design, sample size, timing of exposure, age at assessment, developmental tools used, and key findings. Study quality was assessed using the Newcastle–Ottawa Scale. Results: Among 19 included studies, 12 reported at least some neurodevelopmental delays, particularly in motor and language domains. However, these delays were generally mild, domain-specific, and often not statistically significant. Seven studies, most of which were high-quality and low-risk, reported no significant differences between exposed and unexposed groups. Assessment tools and follow-up durations varied widely, limiting comparability. Conclusions: Current evidence does not support a consistent association between in utero SARS-CoV-2 exposure and unfavorable neurodevelopmental outcomes up to 24 months. However, heterogeneity in methods and short follow-up periods warrant further high-quality longitudinal research.
(This article belongs to the Special Issue New Advances in COVID-19 and Pregnancy)

34 pages, 1992 KB  
Article
Future Skills in the GenAI Era: A Labor Market Classification System Using Kolmogorov–Arnold Networks and Explainable AI
by Dimitrios Christos Kavargyris, Konstantinos Georgiou, Eleanna Papaioannou, Theodoros Moysiadis, Nikolaos Mittas and Lefteris Angelis
Algorithms 2025, 18(9), 554; https://doi.org/10.3390/a18090554 - 2 Sep 2025
Viewed by 158
Abstract
Generative Artificial Intelligence (GenAI) is widely recognized for its profound impact on labor market demand, supply, and skill dynamics. However, due to its transformative nature, GenAI increasingly overlaps with traditional AI roles, blurring boundaries and intensifying the need to reassess workforce competencies. To address this challenge, this paper introduces KANVAS (Kolmogorov–Arnold Network Versatile Algorithmic Solution)—a framework based on Kolmogorov–Arnold Networks (KANs), which utilize B-spline-based, compact, and interpretable neural units—to distinguish between traditional AI roles and emerging GenAI-related positions. The aim of the study is to develop a reliable and interpretable labor market classification system that differentiates these roles using explainable machine learning. Unlike prior studies that emphasize predictive performance, our work is the first to employ KANs as an explanatory tool for labor classification, to reveal how GenAI-related and European Skills, Competences, Qualifications, and Occupations (ESCO)-aligned skills differentially contribute to distinguishing modern from traditional AI job roles. Using raw job vacancy data from two labor market platforms, KANVAS implements a hybrid pipeline combining a state-of-the-art Large Language Model (LLM) with Explainable AI (XAI) techniques, including Shapley Additive Explanations (SHAP), to enhance model transparency. The framework achieves approximately 80% classification consistency between traditional and GenAI-aligned roles, while also identifying the most influential skills contributing to each category. Our findings indicate that GenAI positions prioritize competencies such as prompt engineering and LLM integration, whereas traditional roles emphasize statistical modeling and legacy toolkits. By surfacing these distinctions, the framework offers actionable insights for curriculum design, targeted reskilling programs, and workforce policy development. 
Overall, KANVAS contributes a novel, interpretable approach to understanding how GenAI reshapes job roles and skill requirements in a rapidly evolving labor market. Finally, the open-source implementation of KANVAS is flexible and well-suited for HR managers and relevant stakeholders.

19 pages, 738 KB  
Review
The Use of Advanced Glycation End-Product Measurements to Predict Post-Operative Complications After Cardiac Surgery
by Divya S. Agrawal, Jose C. Motta and Jason M. Ali
J. Clin. Med. 2025, 14(17), 6176; https://doi.org/10.3390/jcm14176176 - 1 Sep 2025
Viewed by 273
Abstract
Background/Objectives: Frailty is increasingly recognised as an important contributor to outcomes following cardiac surgery. Various measures of frailty have been described, but many include subjective assessments that affect the reliability and reproducibility of measurement. A potential biomarker, advanced glycation end products (AGEs), has been suggested to correlate closely with frailty. This may offer the opportunity to measure frailty objectively and may be useful in preoperative risk assessment. The aim of this narrative review is to assess the association between AGEs and outcomes following surgery, in order to evaluate the use of AGEs for preoperative risk assessment. Methods: This review involved searching five sources: MEDLINE (through Ovid), Embase, Cochrane, ClinicalTrials.gov, and a specified Google Scholar search for studies published between database inception and 20 February 2025. The 1142 identified articles were then screened against inclusion and exclusion criteria. The exclusion criteria covered articles not in the English language, studies involving patients under 18 years of age, and studies that were incomplete or for which data were not yet available. This left 11 articles, for which a ‘related articles’ search was performed on Google Scholar on 6 March 2025, as per the PRISMA-S extension guidelines, to obtain all relevant articles available. In the end, data analysis was conducted on 13 articles with a total of 2402 participants. These were categorised by type of surgery before analysis was performed for each surgical category. The quality of evidence was assessed using the ROBINS-I tool, and a risk of bias table has been provided. This study received no external funding. Results: Four out of the five studies in cardiac surgery showed a statistically significant association between AGE levels and post-operative complications and outcomes.
This association was also seen across thoracic and general surgery. Association was demonstrated with various post-operative complications as well as mortality. These relationships are supported by various pathophysiological mechanisms, including the ability of AGEs to induce oxidative stress, activate inflammatory mediators, and cause endothelial dysfunction. Conclusions: There is a body of evidence supporting the association between AGE levels and cardiac surgical outcomes. This objective measure of frailty could have significant utility in preoperative risk assessment and offer the opportunity to identify patients who will benefit from undergoing prehabilitation.
(This article belongs to the Special Issue Preoperative Optimization in Cardiac Surgery)

15 pages, 416 KB  
Article
Evaluating the Effectiveness of Chatbot-Assisted Learning in Enhancing English Conversational Skills Among Secondary School Students
by Abdullah Alenezi and Abdulhameed Alenezi
Educ. Sci. 2025, 15(9), 1136; https://doi.org/10.3390/educsci15091136 - 1 Sep 2025
Viewed by 281
Abstract
The growing application of artificial intelligence in education has created new avenues for second language learning. This research explores the impact of chatbot-assisted learning on English conversation skills among secondary students in the Northern Borders Region of Saudi Arabia. The quasi-experimental design involved 30 students divided into two groups: an experimental group that used a GPT-powered chatbot for three weeks, and a control group that received traditional teaching. Pre- and post-tests were given to assess conversational competence. Students’ attitudes toward the chatbot-assisted learning experience were measured through questionnaires, teacher observation, and chatbot usage logs. Results showed statistically significant improvement in the experimental group’s speaking competence (mean gain = 5.24, p < 0.001). Students showed high motivation, elevated confidence, and high satisfaction with the learning experience provided through the chatbot (overall attitude mean = 4.35/5). Teacher observations indicated that the students were much more engaged and spontaneous, and chatbot usage was positively correlated with score gain (r = 0.61). The outcomes indicate that chatbot-based learning is a practical approach for facilitating the development of spoken English, particularly in low-resource learning environments. The research provides empirical evidence in favour of incorporating interactive AI into EFL teaching in secondary schools in Saudi Arabia.
(This article belongs to the Special Issue Computer-Assisted Language Learning at the Dawn of the AI Revolution)

28 pages, 1711 KB  
Article
Identifying Literary Microgenres and Writing Style Differences in Romanian Novels with ReaderBench and Large Language Models
by Aura Cristina Udrea, Stefan Ruseti, Vlad Pojoga, Stefan Baghiu, Andrei Terian and Mihai Dascalu
Future Internet 2025, 17(9), 397; https://doi.org/10.3390/fi17090397 - 30 Aug 2025
Viewed by 248
Abstract
Recent developments in natural language processing, particularly large language models (LLMs), create new opportunities for literary analysis in underexplored languages like Romanian. This study investigates stylistic heterogeneity and genre blending in 175 late 19th- and early 20th-century Romanian novels, each classified by literary historians into one of 17 genres. Our findings reveal that most novels do not adhere to a single genre label but instead combine elements of multiple (micro)genres, challenging traditional single-label classification approaches. We employed a dual computational methodology that combines an analysis of Romanian-tailored linguistic features with general-purpose LLMs. ReaderBench, a Romanian-specific framework, was utilized to extract surface, syntactic, semantic, and discourse features, capturing fine-grained linguistic patterns. In parallel, we prompted two LLMs (Llama3.3 70B and DeepSeek-R1 70B) to predict genres at the paragraph level, leveraging their ability to detect contextual and thematic coherence across multiple narrative scales. Statistical analyses using Kruskal–Wallis and Mann–Whitney tests identified genre-defining features at both novel and chapter levels. The integration of these complementary approaches enhances microgenre detection beyond traditional classification capabilities. ReaderBench provides quantifiable linguistic evidence, while LLMs capture broader contextual patterns; together, they provide a multi-layered perspective on literary genre that reflects the complex and heterogeneous character of fictional texts. Our results argue that both language-specific and general-purpose computational tools can effectively detect stylistic diversity in Romanian fiction, opening new avenues for computational literary analysis in low-resource languages.
(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))

12 pages, 350 KB  
Article
Women’s Perceptions of Cultural Sensitivity of Midwives During Intrapartum Care in Riyadh, Saudi Arabia
by Abdulaziz M. Alodhialah and Shorok Hamed Alahmedi
Healthcare 2025, 13(17), 2172; https://doi.org/10.3390/healthcare13172172 - 30 Aug 2025
Viewed by 212
Abstract
Background: Cultural sensitivity during intrapartum care is a critical determinant of maternal satisfaction and quality of care, particularly in multicultural settings. In Saudi Arabia, the diversity of birthing women underscores the need for midwives to provide culturally competent, respectful, and individualized care. Objective: To assess women’s perceptions of midwives’ cultural sensitivity during intrapartum care in Riyadh, Saudi Arabia, and identify demographic factors influencing these perceptions. Methods: A quantitative cross-sectional study was conducted using a validated cultural sensitivity questionnaire. Data were collected online through purposive sampling from women who had given birth in the past 12 months. Descriptive statistics summarized participant characteristics and perception scores, while inferential tests examined associations between perceptions and demographic variables. Results: Women reported moderate to high perceptions of cultural sensitivity. Age and nationality significantly influenced perception scores (p < 0.05). While communication and respect for religious practices scored highest, areas such as shared decision making and language-concordant support were identified as needing improvement. Conclusions: Women in Riyadh often perceive midwives as culturally sensitive; however, gaps remain in communication and involvement in decision making. Training programs that strengthen midwives’ cultural competence, especially in language services and patient engagement, could enhance the intrapartum experience.
(This article belongs to the Special Issue Advancing Cultural Competence in Health Care)

29 pages, 434 KB  
Article
Comparative Analysis of Natural Language Processing Techniques in the Classification of Press Articles
by Kacper Piasta and Rafał Kotas
Appl. Sci. 2025, 15(17), 9559; https://doi.org/10.3390/app15179559 - 30 Aug 2025
Viewed by 210
Abstract
The study undertook a comprehensive review and comparative analysis of natural language processing techniques for news article classification, with a particular focus on Java language libraries. The dataset comprised more than 200,000 items of news metadata sourced from The Huffington Post. Traditional algorithms based on mathematical statistics and deep machine learning models were evaluated. The libraries chosen for tests were Apache OpenNLP, Stanford CoreNLP, Waikato Weka, and the Hugging Face ecosystem with the PyTorch backend. The efficacy of the trained models in forecasting specific topics was evaluated, and diverse methodologies for the feature extraction and analysis of word-vector representations were explored. The study considered aspects such as hardware resource management, implementation simplicity, learning time, and the quality of the resulting model in terms of detection, and it examined a range of techniques for attribute selection, feature filtering, vector representation, and the handling of imbalanced datasets. Advanced techniques for word selection and named entity recognition were employed. The study compared different models and configurations in terms of their performance and the resources they consumed. Furthermore, it addressed the difficulties encountered when processing lengthy texts with transformer neural networks, and it presented potential solutions such as sequence truncation and segment analysis. The elevated computational cost inherent to Java-based languages may present challenges in machine learning tasks. The OpenNLP model achieved 84% accuracy, Weka and CoreNLP attained 86% and 88%, respectively, and DistilBERT emerged as the top performer, with an accuracy rate of 92%. Deep learning models demonstrated superior performance, training time, and ease of implementation compared to conventional statistical algorithms.
(This article belongs to the Special Issue Natural Language Processing (NLP) and Applications—2nd Edition)
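The abstract above names sequence truncation and segment analysis as ways to handle texts longer than a transformer's input limit. The following is a minimal Python sketch of both ideas, not the authors' implementation: the function names, the 512-token limit, the overlap size, and the majority-vote aggregation are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Sequence


def truncate(token_ids: Sequence[int], max_len: int = 512) -> List[int]:
    """Sequence truncation: keep only the first max_len tokens."""
    return list(token_ids[:max_len])


def split_into_segments(
    token_ids: Sequence[int], max_len: int = 512, overlap: int = 128
) -> List[List[int]]:
    """Segment analysis: cut a long sequence into overlapping windows.

    max_len and overlap are hypothetical defaults, not values from the study.
    """
    if max_len <= 0 or not 0 <= overlap < max_len:
        raise ValueError("require max_len > 0 and 0 <= overlap < max_len")
    if len(token_ids) <= max_len:
        return [list(token_ids)]
    step = max_len - overlap
    segments = []
    for start in range(0, len(token_ids), step):
        segments.append(list(token_ids[start:start + max_len]))
        if start + max_len >= len(token_ids):
            break  # the last window already reaches the end of the text
    return segments


def aggregate_by_majority(segment_labels: Sequence[str]) -> str:
    """Combine per-segment predictions into one document label (assumed rule)."""
    return Counter(segment_labels).most_common(1)[0][0]
```

Truncation is cheaper but discards everything past the limit, whereas segmenting keeps the whole text at the cost of one model call per window plus an aggregation step.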

11 pages, 1958 KB  
Article
Neurodevelopmental Outcomes in Newborns with Congenital Gastrointestinal Atresias
by Duygu Tuncel, Senay Guven Baysal, Tulin Oztaş and Nilüfer Okur
Children 2025, 12(9), 1153; https://doi.org/10.3390/children12091153 - 29 Aug 2025
Viewed by 186
Abstract
Background: Neurodevelopmental outcomes in the population affected by gastrointestinal atresias have not been well studied. Current evidence suggests that damage to the central nervous system is important in congenital gastrointestinal malformations. This study aims to understand the effects of gastrointestinal atresias on neurodevelopmental outcomes in our patient group. Methods: This cross-sectional, population-based study examined patients with congenital gastrointestinal atresias who were admitted to the neonatal intensive care unit and underwent gastrointestinal surgery. The Bayley III scale was administered to 32 patients aged 7–42 months. Results: Thirty-two patients with gastrointestinal atresia were included in the study. Eighteen (56.2%) of the patients were male. The median gestational age was 37 weeks (range 25–39 weeks) and the median birthweight was 2700 g (range 700–3800 g). A Bayley III evaluation was performed at a median age of 13.7 months (range 7–41 months). The cognitive, motor, and language composite scores were 90, 86, and 89, respectively. The motor score was lower than the cognitive and language scores. No statistically significant association was found between low scores and gender or stoma presence in any of the three neurodevelopmental categories (p > 0.05). Conclusion: Patients with congenital gastrointestinal malformations are reported in the literature to have lower motor and language development scores. In our study, lower cognitive and language scores were observed in only one patient, whereas motor delay was more prevalent in the study population. Close neurodevelopmental follow-up of infants with gastrointestinal atresias may improve their quality of life.
(This article belongs to the Section Pediatric Neonatology)

24 pages, 3212 KB  
Article
Comparative Performance Analysis of Software-Based Restoration Techniques for NAVTEX Message
by Hoyeon Cho, Changui Lee and Seojeong Lee
J. Mar. Sci. Eng. 2025, 13(9), 1657; https://doi.org/10.3390/jmse13091657 - 29 Aug 2025
Viewed by 312
Abstract
Maritime transportation requires reliable navigational safety communications to ensure vessel safety and operational efficiency. The Maritime Single Window (MSW) enables vessels to submit all maritime data digitally without human intervention. NAVTEX (Navigational Telex) messages provide navigational warnings, meteorological warnings and forecasts, piracy information, and search and rescue information that require integration into an automated MSW system. However, NAVTEX transmissions experience message corruption when Forward Error Correction (FEC) mechanisms fail, marking unrecoverable characters with asterisks. Current standards require discarding messages whose error rate exceeds 4%, resulting in safety information loss. Traditional human interpretation of corrupted messages creates limitations that prevent automated MSW integration. This paper presents the application of Masked Language Modeling (MLM) with Transformer encoders for automated NAVTEX message restoration. Our approach treats asterisk characters as masked tokens, enabling bidirectional context processing to reconstruct corrupted characters. We evaluated MLM against dictionary-matching and n-gram models using 69,658 NAVTEX messages with corruption ranging from 1% to 33%. MLM achieved an 85.4% restoration rate versus 44.4–64.0% for statistical methods. MLM maintained residual error rates below the 4% threshold for initial corruption up to 25%, while statistical methods exceeded this limit at 10%. This automated restoration capability supports MSW integration while preserving critical safety information during challenging transmission conditions.
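The preprocessing described in this abstract, where asterisks mark unrecoverable characters and messages above a 4% error rate are discarded, can be sketched in Python as follows. This is only an illustrative sketch: the helper names, the `[MASK]` token string, character-level tokenization, and the choice to compute the error rate over all characters of the message are assumptions, and the model that fills the masks is not shown.

```python
from typing import List

MASK_CHAR = "*"          # character NAVTEX uses for unrecoverable symbols
MAX_ERROR_RATE = 0.04    # 4% discard threshold mentioned in the abstract


def error_rate(message: str) -> float:
    """Fraction of characters marked as corrupted (assumed definition)."""
    if not message:
        return 0.0
    return message.count(MASK_CHAR) / len(message)


def masked_positions(message: str) -> List[int]:
    """Indices a restoration model would have to predict."""
    return [i for i, ch in enumerate(message) if ch == MASK_CHAR]


def to_masked_tokens(message: str, mask_token: str = "[MASK]") -> List[str]:
    """Character-level tokens with each asterisk replaced by a mask token."""
    return [mask_token if ch == MASK_CHAR else ch for ch in message]


def is_acceptable(message: str) -> bool:
    """True if the message would not be discarded under the 4% rule."""
    return error_rate(message) <= MAX_ERROR_RATE
```

Under these assumptions, `is_acceptable` applied to a restored message corresponds to the residual-error check the abstract reports, while applied to the received message it corresponds to the current discard rule.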

25 pages, 4657 KB  
Article
Identifying Methodological Language in Psychology Abstracts: A Machine Learning Approach Using NLP and Embedding-Based Clustering
by Konstantinos G. Stathakis, George Papageorgiou and Christos Tjortjis
Big Data Cogn. Comput. 2025, 9(9), 224; https://doi.org/10.3390/bdcc9090224 - 29 Aug 2025
Viewed by 283
Abstract
Research articles are valuable resources for Information Retrieval and Natural Language Processing (NLP) tasks, offering opportunities to analyze key components of scholarly content. This study investigates the presence of methodological terminology in psychology research over the past 30 years (1995–2024) by applying a novel NLP and Machine Learning pipeline to a large corpus of 85,452 abstracts, as well as the extent to which this terminology forms distinct thematic groupings. Combining glossary-based extraction, contextualized language model embeddings, and dual-mode clustering, this study offers a scalable framework for the exploration of methodological transparency in scientific text via deep semantic structures. A curated glossary of 365 method-related keywords served as a gold-standard reference for term identification, using direct and fuzzy string matching. Retrieved terms were encoded with SciBERT, averaging embeddings across contextual occurrences to produce unified vectors. These vectors were clustered using unsupervised and weighted unsupervised approaches, yielding six and ten clusters, respectively. Cluster composition was analyzed using weighted statistical measures to assess term importance within and across groups. A total of 78.16% of the examined abstracts contained glossary terms, with an average of 1.8 terms per abstract, highlighting an increasing presence of methodological terminology in psychology and reflecting a shift toward greater transparency in research reporting. This work goes beyond the use of static vectors by incorporating contextual understanding in the examination of methodological terminology, while offering a scalable and generalizable approach to semantic analysis in scientific texts, with implications for meta-research, domain-specific lexicon development, and automated scientific knowledge discovery.
(This article belongs to the Special Issue Machine Learning Applications in Natural Language Processing)
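Term identification by direct and fuzzy string matching, as mentioned in the abstract above, might look roughly like the Python sketch below. It is not the authors' pipeline: the function name, the word-window comparison, the use of the standard library's `difflib.SequenceMatcher`, and the 0.85 similarity threshold are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher
from typing import Iterable, Set


def find_glossary_terms(
    abstract: str, glossary: Iterable[str], threshold: float = 0.85
) -> Set[str]:
    """Return glossary terms that occur in the abstract, exactly or approximately.

    Each term is compared against every window of the same word count;
    threshold is a hypothetical similarity cut-off for the fuzzy match.
    """
    words = re.findall(r"[a-z0-9\-]+", abstract.lower())
    found = set()
    for term in glossary:
        term_lc = term.lower()
        n = len(term_lc.split())
        for i in range(len(words) - n + 1):
            candidate = " ".join(words[i:i + n])
            # Direct match first, then fall back to a similarity ratio.
            if candidate == term_lc or (
                SequenceMatcher(None, candidate, term_lc).ratio() >= threshold
            ):
                found.add(term)
                break
    return found
```

A lower threshold catches more misspellings and inflected forms but also raises the risk of false matches, so in practice the cut-off would need to be tuned against a labeled sample.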