Information, Volume 16, Issue 9 (September 2025) – 91 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link and use the free Adobe Reader to open them.
20 pages, 2404 KB  
Article
TFR-LRC: Rack-Optimized Locally Repairable Codes: Balancing Fault Tolerance, Repair Degree, and Topology Awareness in Distributed Storage Systems
by Yan Wang, Yanghuang Cao and Junhao Shi
Information 2025, 16(9), 803; https://doi.org/10.3390/info16090803 (registering DOI) - 15 Sep 2025
Abstract
Locally Repairable Codes (LRCs) have become the dominant design in wide-stripe erasure coding storage systems due to their excellent locality and low repair bandwidth. In such systems, the repair degree—defined as the number of helper nodes contacted during data recovery—is a key performance metric. However, as stripe width increases, the probability of multiple simultaneous node failures grows, which significantly raises the repair degree in traditional LRCs. Addressing this challenge, we propose a new family of codes called TFR-LRCs (Locally Repairable Codes for balancing fault tolerance and repair efficiency). TFR-LRCs introduce flexible design choices that allow trade-offs between fault tolerance and repair degree: they can reduce the repair degree by slightly increasing storage overhead, or enhance fault tolerance by tolerating a slightly higher repair degree. We design a matrix-based construction to generate TFR-LRCs and evaluate their performance through extensive simulations. The results show that, under multiple failure scenarios, TFR-LRC reduces the repair degree by up to 35% compared with conventional LRCs, while preserving the original LRC structure. Moreover, under identical code parameters, TFR-LRC achieves improved fault tolerance, tolerating up to g+2 failures versus g+1 in conventional LRCs, with minimal additional cost. Notably, in maintenance mode, where entire racks may become temporarily unavailable, TFR-LRC demonstrates substantially better recovery efficiency compared to existing LRC schemes, making it a practical choice for real-world deployments. Full article
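
To make the locality idea concrete, the toy sketch below encodes data blocks into local groups with a single XOR parity each, so a lost block is rebuilt from only its group members (the repair degree); this is a generic LRC-style illustration, not the paper's TFR-LRC matrix construction, and all block contents are invented.

```python
# A minimal, hypothetical illustration of locality in an LRC-style layout
# (not the paper's TFR-LRC construction): data blocks are split into local
# groups, each protected by an XOR parity, so a single lost block is repaired
# by contacting only the other members of its group (the "repair degree").

def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

def encode_local_groups(data_blocks, group_size):
    """Split data blocks into local groups and append one XOR parity per group."""
    groups = [data_blocks[i:i + group_size] for i in range(0, len(data_blocks), group_size)]
    return [group + [xor_blocks(group)] for group in groups]

def repair(group, lost_index):
    """Repair one lost block from the surviving members of its local group."""
    survivors = [blk for i, blk in enumerate(group) if i != lost_index]
    return xor_blocks(survivors)  # repair degree = len(survivors)

if __name__ == "__main__":
    data = [bytes([i] * 4) for i in range(6)]          # six toy data blocks
    groups = encode_local_groups(data, group_size=3)   # two groups of 3 + parity
    recovered = repair(groups[0], lost_index=1)
    assert recovered == data[1]
    print("repaired block with", len(groups[0]) - 1, "helpers")
```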

23 pages, 1899 KB  
Article
A Container-Native IAM Framework for Secure Green Mobility: A Case Study with Keycloak and Kubernetes
by Alexandre Sousa, Frederico Branco, Arsénio Reis and Manuel J. C. S. Reis
Information 2025, 16(9), 802; https://doi.org/10.3390/info16090802 (registering DOI) - 15 Sep 2025
Abstract
The rapid adoption of green mobility solutions—such as electric-vehicle sharing and intelligent transportation systems—has accelerated the integration of Internet of Things (IoT) technologies, introducing complex security and performance challenges. While conceptual Identity and Access Management (IAM) frameworks exist, few are empirically validated for the scale, heterogeneity, and real-time demands of modern mobility ecosystems. This work presents a data-backed, container-native reference architecture for secure and resilient Authentication, Authorization, and Accounting (AAA) in green mobility environments. The framework integrates Keycloak within a Kubernetes-orchestrated infrastructure and applies Zero Trust and defense-in-depth principles. Effectiveness is demonstrated through rigorous benchmarking across latency, throughput, memory footprint, and automated fault recovery. Compared to a monolithic baseline, the proposed architecture achieves over 300% higher throughput, 90% faster startup times, and 75% lower idle memory usage while enabling full service restoration in under one minute. This work establishes a validated deployment blueprint for IAM in IoT-driven transportation systems, offering a practical foundation for a secure and scalable mobility infrastructure. Full article
13 pages, 382 KB  
Article
The Blockchain Trust Paradox: Engineered Trust vs. Experienced Trust in Decentralized Systems
by Scott Keaney and Pierre Berthon
Information 2025, 16(9), 801; https://doi.org/10.3390/info16090801 (registering DOI) - 15 Sep 2025
Abstract
Blockchain is described as a technology of trust. Its design relies on cryptography, decentralization, and immutability to ensure secure and transparent transactions. Yet users frequently report confusion, frustration, and skepticism when engaging with blockchain applications. This tension is the blockchain trust paradox: while trust is engineered into the technology, trust is not always experienced by its users. Our article examines the paradox through three theoretical perspectives. Socio-Technical Systems (STS) theory highlights how trust emerges from the interaction between technical features and social practices; Technology Acceptance models (TAM and UTAUT) emphasize how perceived usefulness and ease of use shape adoption. Ostrom’s commons governance theory explains how legitimacy and accountability affect trust in decentralized networks. Drawing on recent research in experience design, human–computer interaction, and decentralized governance, the article identifies the barriers that undermine user confidence. These include complex key management, unpredictable transaction costs, and unclear processes for decision-making and dispute resolution. The article offers an integrated framework that links engineered trust with experienced trust. Seven propositions are developed to guide future research and practice. The conclusion argues that blockchain technologies will gain traction if design and governance evolve alongside technical protocols to create systems that are both technically secure and trustworthy in experience. Full article
(This article belongs to the Special Issue Information Technology in Society)

17 pages, 1659 KB  
Article
Enhancing Multi-Region Target Search Efficiency Through Integrated Peripheral Vision and Head-Mounted Display Systems
by Gang Wang, Hung-Hsiang Wang and Zhihuang Huang
Information 2025, 16(9), 800; https://doi.org/10.3390/info16090800 - 15 Sep 2025
Abstract
Effectively managing visual search tasks across multiple spatial regions during daily activities such as driving, cycling, and navigating complex environments often overwhelms visual processing capacity, increasing the risk of errors and missed critical information. This study investigates an integrated approach that combines an Ambient Display system utilizing peripheral vision cues with traditional Head-Mounted Displays (HMDs) to enhance spatial search efficiency while minimizing cognitive burden. We systematically evaluated this integrated HMD-Ambient Display system against standalone HMD configurations through comprehensive user studies involving target search scenarios across multiple spatial regions. Our findings demonstrate that the combined approach significantly improves user performance by establishing a complementary visual system where peripheral stimuli effectively capture initial attention while central HMD cues provide precise directional guidance. The integrated system showed substantial improvements in reaction time for rear visual region searches and higher user preference ratings compared with HMD-only conditions. This integrated approach represents an innovative solution that efficiently utilizes dual visual channels, reducing cognitive load while enhancing search efficiency across distributed spatial areas. Our contributions provide valuable design guidelines for developing assistive technologies that improve performance in multi-region visual search tasks by strategically leveraging the complementary strengths of peripheral and central visual processing mechanisms. Full article

18 pages, 6253 KB  
Article
Exploring Sign Language Dataset Augmentation with Generative Artificial Intelligence Videos: A Case Study Using Adobe Firefly-Generated American Sign Language Data
by Valentin Bercaru and Nirvana Popescu
Information 2025, 16(9), 799; https://doi.org/10.3390/info16090799 - 15 Sep 2025
Abstract
Currently, high quality datasets focused on Sign Language Recognition are either private, proprietary or difficult to obtain due to costs. Therefore, we aim to mitigate this problem by augmenting a publicly available dataset with artificially generated data in order to enrich and obtain a more diverse dataset. The performance of Sign Language Recognition (SLR) systems is highly dependent on the quality and diversity of training datasets. However, acquiring large-scale and well-annotated sign language video data remains a significant challenge. This experiment explores the use of Generative Artificial Intelligence (GenAI), specifically Adobe Firefly, to create synthetic video data for American Sign Language (ASL) fingerspelling. Thirteen letters out of 26 were selected for generation, and short videos representing each sign were synthesized and processed into static frames. These synthetic frames replaced approximately 7.5% of the original dataset and were integrated into the training data of a publicly available Convolutional Neural Network (CNN) model. After retraining the model with the augmented dataset, the accuracy did not drop. Moreover, the validation accuracy was approximately the same. The resulting model achieved a maximum accuracy of 98.04%. While the performance gain was limited (less than 1%), the approach illustrates the feasibility of using GenAI tools to generate training data and supports further research into data augmentation for low-resource SLR tasks. Full article
(This article belongs to the Section Artificial Intelligence)
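
As a rough illustration of the augmentation ratio described above, the sketch below swaps a fixed fraction (about 7.5%) of original training frames for synthetic ones; the file names and the helper function are hypothetical and do not reflect the authors' actual pipeline.

```python
# A rough sketch (hypothetical paths and helpers) of mixing synthetic,
# GenAI-derived frames into an existing fingerspelling training set so that
# they replace a fixed fraction (here ~7.5%) of the original frames.
import random

def augment_with_synthetic(original_frames, synthetic_frames, replace_fraction=0.075, seed=0):
    """Return a training list in which a random subset of original frames
    is swapped for synthetic ones; items could be file paths or arrays."""
    rng = random.Random(seed)
    n_replace = min(int(len(original_frames) * replace_fraction), len(synthetic_frames))
    replace_idx = set(rng.sample(range(len(original_frames)), n_replace))
    synthetic_iter = iter(rng.sample(synthetic_frames, n_replace))
    return [next(synthetic_iter) if i in replace_idx else frame
            for i, frame in enumerate(original_frames)]

# Example usage with placeholder identifiers:
train = [f"real_{i}.png" for i in range(1000)]
generated = [f"firefly_{i}.png" for i in range(200)]
mixed = augment_with_synthetic(train, generated)
print(sum(f.startswith("firefly") for f in mixed), "synthetic frames in the mix")
```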

25 pages, 3276 KB  
Article
CPB-YOLOv8: An Enhanced Multi-Scale Traffic Sign Detector for Complex Road Environment
by Wei Zhao, Lanlan Li and Xin Gong
Information 2025, 16(9), 798; https://doi.org/10.3390/info16090798 - 15 Sep 2025
Abstract
Traffic sign detection is critically important for intelligent transportation systems, yet persistent challenges like multi-scale variation and complex background interference severely degrade detection accuracy and real-time performance. To address these limitations, this study presents CPB-YOLOv8, an advanced multi-scale detection framework based on the YOLOv8 architecture. A Cross-Stage Partial-Partitioned Transformer Block (CSP-PTB) is incorporated into the feature extraction stage to preserve semantic information during downsampling while enhancing global feature representation. For feature fusion, a four-level bidirectional feature pyramid BiFPN integrated with a P2 detection layer significantly improves small-target detection capability. Further enhancement is achieved via an optimized loss function that balances multi-scale objective localization. Comprehensive evaluations were conducted on the TT100K, the CCTSDB, and a custom multi-scenario road image dataset capturing urban and suburban environments at 1920 × 1080 resolution. Results demonstrate compelling performance: On TT100K, CPB-YOLOv8 achieved 90.73% mAP@0.5 with a 12.5 MB model size, exceeding the YOLOv8s baseline by 3.94 percentage points and achieving 6.43% higher small-target recall. On CCTSDB, it attained a near-saturation performance of 99.21% mAP@0.5. Crucially, the model demonstrated exceptional robustness across diverse environmental conditions. Rigorous analysis on partitioned CCTSDB subsets based on weather and illumination, alongside validation using a separate self-collected dataset reserved solely for inference, confirmed strong adaptability to real-world distribution shifts and low-visibility scenarios. Cross-dataset validation and visual comparisons further substantiated the model’s robustness and its effective suppression of background interference. Full article
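
For orientation, the snippet below trains and evaluates a stock YOLOv8s baseline with the Ultralytics API on a traffic-sign dataset; the CSP-PTB, BiFPN/P2, and loss modifications that define CPB-YOLOv8 are architectural changes not reproduced here, and the dataset YAML path is a placeholder.

```python
# Baseline-only sketch: train and validate unmodified YOLOv8s on a traffic-sign
# dataset. The paper's CPB-YOLOv8 changes the backbone, neck, and loss, which
# this snippet does not attempt to reproduce.
from ultralytics import YOLO

model = YOLO("yolov8s.pt")                                # pretrained baseline weights
model.train(data="tt100k.yaml", epochs=100, imgsz=640)    # hypothetical dataset config
metrics = model.val()                                     # validation metrics
print(metrics.box.map50)                                  # mAP@0.5 of the baseline
```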

24 pages, 2377 KB  
Article
A FinTech-Aligned Optimization Framework for IoT-Enabled Smart Agriculture to Mitigate Greenhouse Gas Emissions
by Sofia Polymeni, Dimitrios N. Skoutas, Georgios Kormentzas and Charalabos Skianis
Information 2025, 16(9), 797; https://doi.org/10.3390/info16090797 - 14 Sep 2025
Abstract
With agriculture being the second biggest contributor to greenhouse gas (GHG) emissions through the excessive use of fertilizers, machinery, and inefficient farming practices, global efforts to reduce emissions have been intensified, opting for smarter, data-driven solutions. However, while machine learning (ML) offers powerful predictive capabilities, its black-box nature presents a challenge for trust and adoption, particularly when integrated with auditable financial technology (FinTech) principles. To address this gap, this work introduces a novel, explanation-focused GHG emission optimization framework for IoT-enabled smart agriculture that is both transparent and prescriptive, distinguishing itself from macro-level land-use solutions by focusing on optimizable management practices while aligning with core FinTech principles and pollutant stock market mechanisms. The framework employs a two-stage statistical methodology that first identifies distinct agricultural emission profiles from macro-level data, and then models these emissions by developing a cluster-oriented principal component regression (PCR) model, which outperforms simpler variants by approximately 35% on average across all clusters. This interpretable model then serves as the core of a FinTech-aligned optimization framework that combines cluster-oriented modeling knowledge with a sequential least squares quadratic programming (SLSQP) algorithm to minimize emission-related costs under a carbon pricing mechanism, showcasing forecasted cost reductions as high as 43.55%. Full article
(This article belongs to the Special Issue Technoeconomics of the Internet of Things)
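
The two computational building blocks named in the abstract, principal component regression and SLSQP-based cost minimization, can be sketched on toy data as follows; the feature names, cost function, and carbon price are illustrative assumptions rather than the paper's actual model.

```python
# A minimal sketch of (1) principal component regression (PCA features fed to a
# linear model) and (2) a bounded cost minimization solved with SciPy's SLSQP
# method, both on invented data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))                                   # toy management-practice features
y = X @ rng.normal(size=6) + rng.normal(scale=0.1, size=200)    # toy emissions

# Principal component regression: project onto a few components, then regress.
pca = PCA(n_components=3).fit(X)
pcr = LinearRegression().fit(pca.transform(X), y)

def emission_cost(x, carbon_price=50.0):
    """Predicted emissions priced under a toy carbon-pricing scheme."""
    return carbon_price * float(pcr.predict(pca.transform(x.reshape(1, -1)))[0])

x0 = X.mean(axis=0)
bounds = [(lo, hi) for lo, hi in zip(X.min(axis=0), X.max(axis=0))]
result = minimize(emission_cost, x0, method="SLSQP", bounds=bounds)
print("optimized cost:", round(result.fun, 2))
```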

22 pages, 785 KB  
Article
Detection of Fake News in Romanian: LLM-Based Approaches to COVID-19 Misinformation
by Alexandru Dima, Ecaterina Ilis, Diana Florea and Mihai Dascalu
Information 2025, 16(9), 796; https://doi.org/10.3390/info16090796 - 13 Sep 2025
Viewed by 39
Abstract
The spread of misinformation during the COVID-19 pandemic raised widespread concerns about public health communication and media reliability. In this study, we focus on these issues as they manifested in Romanian-language media and employ Large Language Models (LLMs) to classify misinformation, with a particular focus on super-narratives—broad thematic categories that capture recurring patterns and ideological framings commonly found in pandemic-related fake news, such as anti-vaccination discourse, conspiracy theories, or geopolitical blame. While some of the categories reflect global trends, others are shaped by the Romanian cultural and political context. We introduce a novel dataset of fake news centered on COVID-19 misinformation in the Romanian geopolitical context, comprising both annotated and unannotated articles. We experimented with multiple LLMs using zero-shot, few-shot, supervised, and semi-supervised learning strategies, achieving the best results with an LLaMA 3.1 8B model and semi-supervised learning, which yielded an F1-score of 78.81%. Experimental evaluations compared this approach to traditional Machine Learning classifiers augmented with morphosyntactic features. Results show that semi-supervised learning substantially improved classification results in both binary and multi-class settings. Our findings highlight the effectiveness of semi-supervised adaptation in low-resource, domain-specific contexts, as well as the necessity of enabling real-time misinformation tracking and enhancing transparency through claim-level explainability and fact-based counterarguments. Full article
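
The semi-supervised strategy credited with the best results can be sketched generically as a confidence-thresholded pseudo-labeling loop; the classifier below is a scikit-learn-style stand-in (NumPy arrays assumed), not the LLaMA 3.1 pipeline used in the study.

```python
# Generic confidence-thresholded pseudo-labeling: train on labeled data, adopt
# high-confidence predictions on unlabeled data as labels, and repeat.
import numpy as np

def pseudo_label_rounds(model, X_lab, y_lab, X_unlab, threshold=0.9, rounds=3):
    X_train, y_train = X_lab.copy(), y_lab.copy()
    pool = X_unlab.copy()
    for _ in range(rounds):
        model.fit(X_train, y_train)
        if len(pool) == 0:
            break
        proba = model.predict_proba(pool)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break
        # Adopt high-confidence predictions as labels and move them to the train set.
        X_train = np.vstack([X_train, pool[confident]])
        y_train = np.concatenate([y_train, proba[confident].argmax(axis=1)])
        pool = pool[~confident]
    return model

# Example usage with a simple classifier standing in for the LLM:
# from sklearn.linear_model import LogisticRegression
# clf = pseudo_label_rounds(LogisticRegression(max_iter=1000), X_lab, y_lab, X_unlab)
```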

15 pages, 374 KB  
Article
Digital Governance, Democracy and Public Funding Efficiency in the EU-27: Comparative Insights with Emphasis on Greece
by Kyriaki Efthalitsidou, Konstantinos Spinthiropoulos, George Vittas and Nikolaos Sariannidis
Information 2025, 16(9), 795; https://doi.org/10.3390/info16090795 - 12 Sep 2025
Viewed by 111
Abstract
This study explores the relationship between digital governance, democratic quality, and public funding efficiency across the EU-27, with an emphasis on Greece. Using 2023 cross-sectional data from the DESI, Worldwide Governance Indicators, and Eurostat, we apply OLS regression and simulated DEA to assess how digital maturity and democratic engagement impact fiscal performance. The sample includes all 27 EU member states, and the analysis is subject to limitations due to the cross-sectional design and the use of simulated DEA scores. Results show that higher DESI and Voice and Accountability scores are positively associated with greater efficiency. Greece, while improving, remains below the EU average. The novelty of this paper lies in combining econometric regression with efficiency benchmarking, highlighting the interplay of digital and democratic dimensions in fiscal performance. The findings highlight the importance of integrating digital infrastructure with participatory governance to achieve sustainable public finance. Full article
(This article belongs to the Special Issue Information Technology in Society)
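
A minimal sketch of the regression step, assuming invented country-level data, might look like the following; the paper's DEA efficiency scores are simulated inputs, and they are simply faked here for illustration.

```python
# Illustrative OLS relating a simulated efficiency score to DESI- and
# Voice-and-Accountability-style indicators for 27 fictitious countries.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "desi": rng.uniform(30, 70, size=27),                  # digital maturity proxy
    "voice_accountability": rng.uniform(-0.5, 1.5, size=27),
})
df["efficiency"] = (0.4 * df["desi"] + 5.0 * df["voice_accountability"]
                    + rng.normal(scale=3.0, size=27))      # fake DEA-style score

X = sm.add_constant(df[["desi", "voice_accountability"]])
ols = sm.OLS(df["efficiency"], X).fit()
print(ols.summary().tables[1])
```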

44 pages, 1085 KB  
Article
EDTF: A User-Centered Approach to Digital Educational Games Design and Development
by Raluca Ionela Maxim and Joan Arnedo-Moreno
Information 2025, 16(9), 794; https://doi.org/10.3390/info16090794 - 12 Sep 2025
Viewed by 179
Abstract
The creation of digital educational games often lacks strong user-centered design despite available frameworks, which tend to focus on technical and instructional aspects. This paper presents the Empathic Design Thinking Framework (EDTF), a structured methodology tailored to digital educational game creation. Rooted in human–computer interaction (HCI) principles, the EDTF integrates continuous co-design and iterative user research from ideation to deployment, involving both learners and instructors throughout all phases; it positions empathic design (ED) principles as an important component of HCI, focusing not only on identifying user needs but also on understanding users’ lived experiences, motivations, and frustrations. Developed through design science research, the EDTF offers step-by-step guidance, comprised of 10 steps, that reduces uncertainty for novice and experienced designers, developers, and HCI experts alike. The framework was validated in two robust phases. First, it was evaluated by 60 instructional game experts, including designers, developers, and HCI professionals, using an adapted questionnaire covering dimensions like clarity, problem-solving, consistency, and innovation, as well as standardized scales such as UMUX-Lite for perceived ease of use and usefulness and SUS for perceived usability. This was followed by in-depth interviews with 18 experts to understand the feasibility and conceptualization of EDTF applicability. The strong validation results highlight the framework’s potential to guide the design and development of educational games that take into account HCI principles and are usable, efficient, and impactful. Full article
(This article belongs to the Special Issue Recent Advances and Perspectives in Human-Computer Interaction)

41 pages, 1953 KB  
Article
Balancing Business, IT, and Human Capital: RPA Integration and Governance Dynamics
by José Cascais Brás, Ruben Filipe Pereira, Marcella Melo, Isaias Scalabrin Bianchi and Rui Ribeiro
Information 2025, 16(9), 793; https://doi.org/10.3390/info16090793 - 12 Sep 2025
Viewed by 269
Abstract
In the era of rapid technological progress, Robotic Process Automation (RPA) has emerged as a pivotal tool across professional domains. Organizations pursue automation to boost efficiency and productivity, control costs, and reduce errors. RPA software automates repetitive, rules-based tasks previously performed by employees, and its effectiveness depends on integration across the business–IT–people interface. We adopted a mixed-methods study combining a PRISMA-guided multivocal review of peer-reviewed and gray sources with semi-structured practitioner interviews to capture firsthand insights and diverse perspectives. Triangulation of these phases examines RPA governance, auditing, and policy. The study clarifies the relationship between business processes and IT and offers guidance that supports procedural standardization, regulatory compliance, employee engagement, role clarity, and effective change management—thereby increasing the likelihood of successful RPA initiatives while prudently mitigating associated risks. Full article

16 pages, 1697 KB  
Article
Enhancing Ancient Ceramic Knowledge Services: A Question Answering System Using Fine-Tuned Models and GraphRAG
by Zhi Chen and Bingxiang Liu
Information 2025, 16(9), 792; https://doi.org/10.3390/info16090792 - 11 Sep 2025
Viewed by 104
Abstract
To address the challenges of extensive domain expertise and deficient semantic comprehension in the digital preservation of ancient ceramics, this paper proposes a knowledge question answering (QA) system integrating Low-Rank Adaptation (LoRA) fine-tuning and Graph Retrieval-Augmented Generation (GraphRAG). First, textual information of ceramic images is generated using the GLM-4V-9B model. These texts are then enriched with domain literature to produce ancient ceramic QA pairs via ERNIE 4.0 Turbo, culminating in a high-quality dataset of 2143 curated question–answer groups after manual refinement. Second, LoRA fine-tuning was employed on the Qwen2.5-7B-Instruct foundation model, significantly enhancing its question-answering proficiency specifically for the ancient ceramics domain. Finally, the GraphRAG framework is integrated, combining the fine-tuned large language model with knowledge graph path analysis to augment multi-hop reasoning capabilities for complex queries. Experimental results demonstrate performance improvements of 24.08% in ROUGE-1, 34.75% in ROUGE-2, 29.78% in ROUGE-L, and 4.52% in BERTScore_F1 over the baseline model. This evidence shows that the synergistic implementation of LoRA fine-tuning and GraphRAG delivers significant performance enhancements for ceramic knowledge systems, establishing a replicable technical framework for intelligent cultural heritage knowledge services. Full article
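
The LoRA step on its own can be sketched with the Hugging Face PEFT library as below; the checkpoint identifier, rank, and target modules are assumptions for illustration and may differ from the authors' configuration.

```python
# A hedged sketch of attaching low-rank adapters to a causal language model with
# Hugging Face PEFT. The base checkpoint name, rank, and target modules are
# illustrative assumptions, not the paper's exact setup.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_name = "Qwen/Qwen2.5-7B-Instruct"   # assumed checkpoint identifier
tokenizer = AutoTokenizer.from_pretrained(base_name)
model = AutoModelForCausalLM.from_pretrained(base_name)

lora_cfg = LoraConfig(
    r=8,                       # low-rank dimension
    lora_alpha=16,             # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections (assumed)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()   # only the adapter weights are trainable
```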

40 pages, 471 KB  
Review
Theory and Metatheory in the Nature of Information: Review and Thematic Analysis
by Luke Tredinnick
Information 2025, 16(9), 791; https://doi.org/10.3390/info16090791 - 11 Sep 2025
Viewed by 258
Abstract
This paper addresses the nature of information through a thematic review of the literature. The nature of information describes its fundamental qualities, including structure, meaning, content and use. This paper reviews the critical and theoretical literature with the aim of defining the boundaries of a foundational theory of information. The paper is divided into three parts. The first part addresses metatheoretical aspects of the discourse, including the historicity of information, its conceptual ambiguity, the problem of definition, and the possibility of a foundational theory. The second part addresses key dimension of the critical discourse, including the subjective, objective and intersubjective nature of information, its relationship to meaning, and its relationship to the material world. The final part summarises the main conclusion and outlines the scope of a foundational theory. This paper highlights important gaps in the critical tradition, including the historicity of information, and in its relationship to material reality, complexity and computation. This paper differs from prior reviews in its thematic focus and consideration of metatheoretical aspects of the critical and theoretical tradition. Full article
(This article belongs to the Special Issue Advances in Information Studies)
18 pages, 1061 KB  
Article
HiPC-QR: Hierarchical Prompt Chaining for Query Reformulation
by Hua Yang, Hanyang Li and Teresa Gonçalves
Information 2025, 16(9), 790; https://doi.org/10.3390/info16090790 - 11 Sep 2025
Viewed by 161
Abstract
Query reformulation techniques optimize user queries to better align with documents, thus improving the performance of Information Retrieval (IR) systems. Previous methods have primarily focused on query expansion using techniques such as synonym replacement to improve recall. With the rapid advancement of Large Language Models (LLMs), the knowledge embedded within these models has grown. Research in prompt engineering has introduced various methods, with prompt chaining proving particularly effective for complex tasks. Directly prompting LLMs to reformulate queries has become a viable approach. However, existing LLM-based prompt methods for query reformulation often introduce irrelevant content into reformulated queries, resulting in decreased retrieval precision and misalignment with user intent. We propose a novel approach called Hierarchical Prompt Chaining for Query Reformulation (HiPC-QR). HiPC-QR employs a two-step prompt chaining technique to extract keywords from the original query and refine its structure by filtering out non-essential keywords based on the user’s query intent. This process reduces the query’s restrictiveness while simultaneously expanding essential keywords to enhance retrieval effectiveness. We evaluated the effectiveness of HiPC-QR on two benchmark retrieval datasets, namely MS MARCO and TREC Deep Learning. The experimental results show that HiPC-QR outperforms existing query reformulation methods on large-scale datasets in terms of both recall@10 and MRR@10. Full article
(This article belongs to the Special Issue Machine Learning and Data Mining: Innovations in Big Data Analytics)
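
A schematic version of the two-step prompt chain described above is sketched below; the prompts and the call_llm placeholder are hypothetical and stand in for whatever LLM client is actually used.

```python
# A schematic two-step prompt chain: step 1 extracts keywords from the query,
# step 2 filters and expands them into a reformulated query conditioned on intent.

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call (e.g., an OpenAI- or local-model client)."""
    raise NotImplementedError

def reformulate_query(query: str) -> str:
    # Step 1: keyword extraction.
    keywords = call_llm(
        f"List the essential keywords in this search query, one per line:\n{query}"
    ).splitlines()

    # Step 2: filter non-essential keywords and expand the rest, conditioned on intent.
    return call_llm(
        "Given the user query and its keywords, drop keywords that over-restrict the "
        "query, add close synonyms for the essential ones, and return a single "
        f"reformulated query.\nQuery: {query}\nKeywords: {', '.join(k for k in keywords if k)}"
    )
```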

33 pages, 2139 KB  
Article
Dengue Fever Detection Using Swarm Intelligence and XGBoost Classifier: An Interpretable Approach with SHAP and DiCE
by Proshenjit Sarker, Jun-Jiat Tiang and Abdullah-Al Nahid
Information 2025, 16(9), 789; https://doi.org/10.3390/info16090789 - 10 Sep 2025
Viewed by 117
Abstract
Dengue fever is a mosquito-borne viral disease that annually affects 100–400 million people worldwide. Early detection of dengue enables easy treatment planning and helps reduce mortality rates. This study proposes three Swarm-based Metaheuristic Algorithms, Golden Jackal Optimization, Fox Optimizer, and Sea Lion Optimization, for feature selection and hyperparameter tuning, and an Extreme Gradient Boost classifier to forecast dengue fever using the Predictive Clinical Dengue dataset. Several existing models have been proposed for dengue fever classification, with some achieving high predictive performance. However, most of these studies have overlooked the importance of feature reduction, which is crucial to building efficient and interpretable models. Furthermore, prior research has lacked in-depth analysis of model behavior, particularly regarding the underlying causes of misclassification. Addressing these limitations, this study achieved a 10-fold cross-validation mean accuracy of 99.89%, an F-score of 99.92%, a precision of 99.84%, and a perfect recall of 100% by using only two features: WBC Count and Platelet Count. Notably, FOX-XGBoost and SLO-XGBoost achieved the same performance while utilizing only four and three features, respectively, demonstrating the effectiveness of feature reduction without compromising accuracy. Among these, GJO-XGBoost demonstrated the most efficient feature utilization while maintaining superior performance, emphasizing its potential for practical deployment in dengue fever diagnosis. SHAP analysis identified WBC Count as the most influential feature driving model predictions. Furthermore, DiCE explanations support this finding by showing that lower WBC Counts are associated with dengue-positive cases, whereas higher WBC Counts are indicative of dengue-negative individuals. SHAP interpreted the reasons behind misclassifications, while DiCE provided a correction mechanism by suggesting the minimal changes needed to convert incorrect predictions into correct ones. Full article
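
The modeling recipe reported above, an XGBoost classifier on two clinical features explained with SHAP, can be sketched as follows; the CSV file name and column names are assumptions, and the metaheuristic feature-selection stage is not reproduced.

```python
# XGBoost on a two-feature subset (WBC and platelet counts) with 10-fold
# cross-validation, then SHAP explanations of the fitted model.
import pandas as pd
import shap
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

df = pd.read_csv("dengue_clinical.csv")           # hypothetical file name
X = df[["wbc_count", "platelet_count"]]           # assumed column names
y = df["dengue_positive"]

clf = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
print("10-fold mean accuracy:", scores.mean())

clf.fit(X, y)
explainer = shap.TreeExplainer(clf)
shap_values = explainer.shap_values(X)            # per-feature contributions
shap.summary_plot(shap_values, X)                 # WBC count expected to dominate
```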

23 pages, 35493 KB  
Article
A Novel Point-Cloud-Based Alignment Method for Shelling Tool Pose Estimation in Aluminum Electrolysis Workshop
by Zhenggui Jiang, Yi Long, Yonghong Long, Weihua Fang and Xin Li
Information 2025, 16(9), 788; https://doi.org/10.3390/info16090788 - 10 Sep 2025
Viewed by 81
Abstract
In aluminum electrolysis workshops, real-time pose perception of shelling heads is crucial to process accuracy and equipment safety. However, due to high temperatures, smoke, dust, and metal obstructions, traditional pose estimation methods struggle to achieve high accuracy and robustness. At the same time, the continuous movement of the shelling head and the similar geometric structures around it make it hard to match point-clouds, which makes it even harder to track the position and orientation. In response to the above challenges, we propose a multi-stage optimization pose estimation algorithm based on point-cloud processing. This method is designed for dynamic perception tasks of three-dimensional components in complex industrial scenarios. First stage improves the accuracy of initial matching by combining a weighted 3D Hough voting and adaptive threshold mechanism with an improved FPFH feature matching strategy. In the second stage, by integrating FPFH and PCA feature information, a stable initial registration is achieved using the RANSAC-IA coarse registration framework. In the third stage, we designed an improved ICP algorithm that effectively improved the convergence of the registration process and the accuracy of the final pose estimation. The experimental results show that the proposed method has good robustness and adaptability in a real electrolysis workshop environment, and can achieve pose estimation of the shelling head in the presence of noise, occlusion, and complex background interference. Full article
(This article belongs to the Special Issue Advances in Computer Graphics and Visual Computing)
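
At the core of the ICP stage is the rigid-alignment subproblem: given matched point pairs, recover the rotation and translation that best align them. The NumPy sketch below shows the standard SVD (Kabsch) solution on toy points; the paper's full multi-stage pipeline (weighted Hough voting, FPFH matching, RANSAC coarse registration, improved ICP) is not reproduced.

```python
# NumPy-only least-squares rigid alignment (the Kabsch/SVD step used inside ICP).
import numpy as np

def best_rigid_transform(src, dst):
    """Rotation R and translation t so that R @ src[i] + t ≈ dst[i]."""
    src_c, dst_c = src - src.mean(axis=0), dst - dst.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:             # guard against a reflection
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Toy check: recover a known rotation about the z-axis plus a translation.
theta = np.pi / 6
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
pts = np.random.default_rng(0).normal(size=(50, 3))
R_est, t_est = best_rigid_transform(pts, pts @ R_true.T + np.array([1.0, 2.0, 3.0]))
assert np.allclose(R_est, R_true, atol=1e-6)
```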

37 pages, 626 KB  
Systematic Review
Early Detection and Intervention of Developmental Dyscalculia Using Serious Game-Based Digital Tools: A Systematic Review
by Josep Hornos-Arias, Sergi Grau and Josep M. Serra-Grabulosa
Information 2025, 16(9), 787; https://doi.org/10.3390/info16090787 - 10 Sep 2025
Viewed by 124
Abstract
Developmental dyscalculia is a neurobiologically based learning disorder that impairs numerical processing and calculation abilities. Numerous studies underscore the critical importance of early detection to enable effective intervention, highlighting the need for individualized, structured, and adaptive approaches. Digital tools, particularly those based on serious games, appear to offer a promising level of personalization. This systematic review aims to evaluate the relevance of serious game-based digital solutions as tools for the detection and remediation of developmental dyscalculia in children aged 5 to 12 years. To provide readers with a comprehensive understanding of this field, the selected solutions were analyzed and classified according to the technologies employed (including emerging ones), their thematic focus, the mathematical abilities targeted, the configuration of experimental trials, and the outcomes reported. A systematic search was conducted across Scopus, Web of Knowledge, PubMed, Eric, PsycInfo, and IEEEXplore for studies published between 2000 and March 2025, yielding 7799 records. Additional studies were identified through reference screening. A total of 21 studies met the eligibility criteria. All procedures were registered in PROSPERO and conducted in accordance with PRISMA guidelines for systematic reviews. The methodological analysis of the included studies emphasized the importance of employing both control and experimental groups with adequate sample sizes to ensure robust evaluation. In terms of remediation, the findings highlight the value of pre- and post-intervention assessments and the implementation of individualized training sessions, ideally not exceeding 20 min in duration. The review revealed a greater prevalence of remediation-focused serious games compared to screening tools, with a growing trend toward the use of mobile technologies. However, the substantial methodological limitations observed across studies must be addressed to enable the rigorous evaluation of the potential of SGs to detect and support the improvement of difficulties associated with developmental dyscalculia. Moreover, despite the recognized importance of personalization and adaptability in effective interventions, relatively few studies incorporated machine learning algorithms to enable the development of fully adaptive systems. Full article

15 pages, 770 KB  
Article
Analysis of Large Language Models for Company Annual Reports Based on Retrieval-Augmented Generation
by Abhijit Mokashi, Bennet Puthuparambil, Chaissy Daniel and Thomas Hanne
Information 2025, 16(9), 786; https://doi.org/10.3390/info16090786 - 10 Sep 2025
Viewed by 137
Abstract
Large language models (LLMs) like ChatGPT-4 and Gemini 1.0 demonstrate significant text generation capabilities but often struggle with outdated knowledge, domain specificity, and hallucinations. Retrieval-Augmented Generation (RAG) offers a promising solution by integrating external knowledge sources to produce more accurate and informed responses. This research investigates RAG’s effectiveness in enhancing LLM performance for financial report analysis. We examine how RAG and the specific prompt design improve the provision of qualitative and quantitative financial information in terms of accuracy, relevance, and verifiability. Employing a design science research approach, we compare ChatGPT-4 responses before and after RAG integration, using annual reports from ten selected technology companies. Our findings demonstrate that RAG improves the relevance and verifiability of LLM outputs (by 0.66 and 0.71, respectively, on a scale from 1 to 5), while also reducing irrelevant or incorrect answers. Prompt specificity is shown to critically impact response quality. This study indicates RAG’s potential to mitigate LLM biases and inaccuracies, offering a practical solution for generating reliable and contextually rich financial insights. Full article
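
The retrieval half of such a RAG pipeline can be sketched as below: chunk the annual report, embed the chunks, and prepend the most similar chunks to the prompt. The embed function is a placeholder, and no specific vendor API is assumed.

```python
# Bare-bones retrieval for RAG: fixed-size chunking, embedding, cosine-similarity
# ranking, and prompt assembly with the top-k excerpts.
import numpy as np

def chunk(text: str, size: int = 800) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(texts: list[str]) -> np.ndarray:
    """Placeholder: return one embedding vector per text."""
    raise NotImplementedError

def build_prompt(question: str, report_text: str, k: int = 3) -> str:
    chunks = chunk(report_text)
    doc_vecs = embed(chunks)
    q_vec = embed([question])[0]
    sims = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
    top = [chunks[i] for i in np.argsort(sims)[::-1][:k]]
    context = "\n---\n".join(top)
    return (f"Answer using only the excerpts below and cite them.\n"
            f"Excerpts:\n{context}\n\nQuestion: {question}")
```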

20 pages, 2173 KB  
Article
Intelligent Assessment of Scientific Creativity by Integrating Data Augmentation and Pseudo-Labeling
by Weini Weng, Chang Liu, Guoli Zhao, Luwei Song and Xingli Zhang
Information 2025, 16(9), 785; https://doi.org/10.3390/info16090785 - 10 Sep 2025
Viewed by 139
Abstract
Scientific creativity is a crucial indicator of adolescents’ potential in science and technology, and its automated evaluation plays a vital role in the early identification of innovative talent. To address challenges such as limited sample sizes, high annotation costs, and modality heterogeneity, this study proposes a multimodal assessment method that integrates data augmentation and pseudo-labeling techniques. For the first time, a joint enhancement approach is introduced that combines textual and visual data with a pseudo-labeling strategy to accommodate the characteristics of text–image integration in elementary students’ cognitive expressions. Specifically, SMOTE is employed to expand questionnaire data, EDA is used to enhance hand-drawn text–image data, and text–image semantic alignment is applied to improve sample quality. Additionally, a confidence-driven pseudo-labeling mechanism is incorporated to optimize the use of unlabeled data. Finally, multiple machine learning models are integrated to predict scientific creativity. The results demonstrate the following: 1. Data augmentation significantly increases sample diversity, and the highest accuracy of information alignment was achieved when text and images were matched. 2. The combination of data augmentation and pseudo-labeling mechanisms improves model robustness and generalization. 3. Family environment, parental education, and curiosity are key factors influencing scientific creativity. This study offers a cost-effective and efficient approach for assessing scientific creativity in elementary students and provides practical guidance for fostering their innovative potential. Full article
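
The SMOTE step for the questionnaire data can be illustrated with imbalanced-learn as follows; the toy arrays stand in for the actual questionnaire features and creativity labels.

```python
# Illustrative SMOTE oversampling of an imbalanced, questionnaire-style dataset.
import numpy as np
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 8))                        # questionnaire features (toy)
y = np.array([0] * 100 + [1] * 20)                   # imbalanced creativity labels

X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("before:", np.bincount(y), "after:", np.bincount(y_res))   # minority class synthesized up
```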

25 pages, 1380 KB  
Review
A Systematic Review and Experimental Evaluation of Classical and Transformer-Based Models for Urdu Abstractive Text Summarization
by Muhammad Azhar, Adeen Amjad, Deshinta Arrova Dewi and Shahreen Kasim
Information 2025, 16(9), 784; https://doi.org/10.3390/info16090784 - 9 Sep 2025
Viewed by 121
Abstract
The rapid growth of digital content in Urdu has created an urgent need for effective automatic text summarization (ATS) systems. While extractive methods have been widely studied, abstractive summarization for Urdu remains largely unexplored due to the language’s complex morphology and rich literary tradition. This paper systematically evaluates four transformer-based language models (BERT-Urdu, BART, mT5, and GPT-2) for Urdu abstractive summarization, comparing their performance against conventional machine learning and deep learning approaches. Using multiple Urdu datasets—including the Urdu Summarization Corpus, Fake News Dataset, and Urdu-Instruct-News—we show that fine-tuned Transformer Language Models (TLMs) consistently outperform traditional methods, with the multilingual mT5 model achieving a 0.42 absolute improvement in F1-score over the best baseline. Our analysis reveals that mT5’s architecture is particularly effective at handling Urdu-specific challenges such as right-to-left script processing, diacritic interpretation, and complex verb–noun compounding. Furthermore, we present empirically validated hyperparameter configurations and training strategies for Urdu ATS, establishing transformer-based approaches as the new state-of-the-art for Urdu summarization. Notably, mT5 outperforms Seq2Seq baselines by up to 20% in ROUGE-L, underscoring the efficacy of Transformer-based models for low-resource languages. This work contributes both a systematic review of prior research and a novel empirical benchmark for advancing Urdu abstractive summarization. Full article
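
Since ROUGE-L is a headline metric here, a small self-contained version based on the longest common subsequence is sketched below; whitespace tokenization is a simplification, and published scores normally come from a dedicated ROUGE package with language-aware preprocessing.

```python
# ROUGE-L F-measure from the longest common subsequence of candidate and reference.

def lcs_length(a: list[str], b: list[str]) -> int:
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(candidate: str, reference: str) -> float:
    c, r = candidate.split(), reference.split()
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)

print(rouge_l("the model summarizes urdu news text", "the model summarizes news text in urdu"))
```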

38 pages, 3071 KB  
Article
A Hybrid Framework for the Sensitivity Analysis of Software-Defined Networking Performance Metrics Using Design of Experiments and Machine Learning Techniques
by Chekwube Ezechi, Mobayode O. Akinsolu, Wilson Sakpere, Abimbola O. Sangodoyin, Uyoata E. Uyoata, Isaac Owusu-Nyarko and Folahanmi T. Akinsolu
Information 2025, 16(9), 783; https://doi.org/10.3390/info16090783 - 9 Sep 2025
Viewed by 226
Abstract
Software-defined networking (SDN) is a transformative approach for managing modern network architectures, particularly in Internet-of-Things (IoT) applications. However, ensuring the optimal SDN performance and security often needs a robust sensitivity analysis (SA). To complement existing SA methods, this study proposes a new SA framework that integrates design of experiments (DOE) and machine-learning (ML) techniques. Although existing SA methods have been shown to be effective and scalable, most of these methods have yet to hybridize anomaly detection and classification (ADC) and data augmentation into a single, unified framework. To fill this gap, a targeted application of well-established existing techniques is proposed. This is achieved by hybridizing these existing techniques to undertake a more robust SA of a typified SDN-reliant IoT network. The proposed hybrid framework combines Latin hypercube sampling (LHS)-based DOE and generative adversarial network (GAN)-driven data augmentation to improve SA and support ADC in SDN-reliant IoT networks. Hence, it is called DOE-GAN-SA. In DOE-GAN-SA, LHS is used to ensure uniform parameter sampling, while GAN is used to generate synthetic data to augment data derived from typified real-world SDN-reliant IoT network scenarios. DOE-GAN-SA also employs a classification and regression tree (CART) to validate the GAN-generated synthetic dataset. Through the proposed framework, ADC is implemented, and an artificial neural network (ANN)-driven SA on an SDN-reliant IoT network is carried out. The performance of the SDN-reliant IoT network is analyzed under two conditions: namely, a normal operating scenario and a distributed-denial-of-service (DDoS) flooding attack scenario, using throughput, jitter, and response time as performance metrics. To statistically validate the experimental findings, hypothesis tests are conducted to confirm the significance of all the inferences. The results demonstrate that integrating LHS and GAN significantly enhances SA, enabling the identification of critical SDN parameters affecting the modeled SDN-reliant IoT network performance. Additionally, ADC is also better supported, achieving higher DDoS flooding attack detection accuracy through the incorporation of synthetic network observations that emulate real-time traffic. Overall, this work highlights the potential of hybridizing LHS-based DOE, GAN-driven data augmentation, and ANN-assisted SA for robust network behavioral analysis and characterization in a new hybrid framework. Full article
(This article belongs to the Special Issue Data Privacy Protection in the Internet of Things)
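
The Latin hypercube sampling step can be sketched with SciPy's qmc module as follows; the SDN parameter names and bounds are illustrative stand-ins for the factors actually varied in the experiments.

```python
# Latin hypercube design: uniform coverage of a parameter space with few samples.
import numpy as np
from scipy.stats import qmc

param_names = ["packet_rate", "flow_timeout", "controller_poll_interval"]   # assumed factors
lower = np.array([100.0, 1.0, 0.1])
upper = np.array([5000.0, 60.0, 5.0])

sampler = qmc.LatinHypercube(d=len(param_names), seed=0)
unit_samples = sampler.random(n=30)                  # 30 design points in [0, 1)^3
design = qmc.scale(unit_samples, lower, upper)       # map onto the parameter ranges

for row in design[:3]:
    print(dict(zip(param_names, np.round(row, 2))))
```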

25 pages, 9775 KB  
Article
Opinion Formation in Wikipedia Ising Networks
by Leonardo Ermann, Klaus M. Frahm and Dima L. Shepelyansky
Information 2025, 16(9), 782; https://doi.org/10.3390/info16090782 - 9 Sep 2025
Viewed by 138
Abstract
We study the properties of opinion formation on Wikipedia Ising Networks. Each Wikipedia article is represented as a node, and links are formed by citations of one article to another, generating a directed network of a given language edition with millions of nodes. Ising spins are placed at each node, and their orientation up or down is determined by a majority vote of connected neighbors. At the initial stage, there are only a few nodes from two groups with fixed competing opinions up and down, while other nodes are assumed to have no initial opinion with no effect on the vote. The competition of two opinions is modeled by an asynchronous Monte Carlo process converging to a spin-polarized steady-state phase. This phase remains stable with respect to small fluctuations induced by an effective temperature of the Monte Carlo process. The opinion polarization at the steady state provides opinion (spin) preferences for each node. In the framework of this Ising Network Opinion Formation model, we analyze the influence and competition between political leaders, world countries, and social concepts. This approach is also generalized to the competition between three groups of different opinions described by three colors; for example, Donald Trump, Vladimir Putin, and Xi Jinping, or the USA, Russia, and China, within English, Russian, and Chinese editions of Wikipedia of March 2025. We argue that this approach provides a generic description of opinion formation in various complex networks. Full article
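
A toy version of the described opinion dynamics is sketched below: spins sit on the nodes of a small directed graph, two seed groups are frozen at +1 and -1, and free nodes are updated asynchronously toward the majority of the neighbors they are linked to; the real study runs this on Wikipedia hyperlink networks with millions of nodes.

```python
# Asynchronous majority-vote dynamics on a small directed graph with frozen seeds.
import random

def majority_vote(in_neighbors, seeds_up, seeds_down, steps=10_000, seed=0):
    rng = random.Random(seed)
    nodes = list(in_neighbors)
    spin = {n: 0 for n in nodes}                  # 0 = no initial opinion
    spin.update({n: +1 for n in seeds_up})
    spin.update({n: -1 for n in seeds_down})
    frozen = set(seeds_up) | set(seeds_down)
    for _ in range(steps):
        n = rng.choice(nodes)
        if n in frozen:
            continue
        total = sum(spin[m] for m in in_neighbors[n])
        if total:
            spin[n] = 1 if total > 0 else -1      # adopt the neighborhood majority
    return spin

graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2], 4: [0, 1], 5: [3]}
print(majority_vote(graph, seeds_up={0}, seeds_down={3}))
```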

18 pages, 614 KB  
Article
Digital Technology Deployment and Improved Corporate Performance: Evidence from the Manufacturing Sector in China
by Liwen Cheng, Rui Ma, Xihui Chen and Luca Esposito
Information 2025, 16(9), 781; https://doi.org/10.3390/info16090781 - 9 Sep 2025
Viewed by 275
Abstract
With global supply chains being reshaped and costs surging, China’s manufacturing sector faces mounting pressure to retain its position as the world’s largest manufacturing center. Meeting this challenge demands the full mobilization of digital factors, which has attracted increasing academic attention. However, limited research has examined how the effective integration of digital factors with traditional production factors can improve corporate performance. With data on Chinese manufacturing enterprises from the A-share market, this study employs a fixed effect model and a mediating effect model to analyze how the synergies between digital and traditional factors enhance corporate performance. Further, it illustrates the heterogeneous impacts across different types of enterprises. The results reveal three key findings. First, the synergies between digital and traditional factors significantly enhance corporate performance, with digital–capital synergy proving more effective than digital–labor synergy. Second, this synergy promotes performance improvement through three primary mechanisms: strengthening internal control quality, fostering business model innovation, and increasing product differentiation. Third, the performance effects of multi-factor synergies vary considerably across enterprise types, being more pronounced in non-state-owned enterprises, firms with strong digital attributes, and firms without political connections. Overall, this study offers valuable insights for manufacturing firms seeking a competitive edge in high-end and intelligent manufacturing within an increasingly globalized competitive landscape. Full article

17 pages, 633 KB  
Article
Predicting Achievers in an Online Theatre Course Designed upon the Principles of Sustainable Education
by Stamatios Ntanos, Ioannis Georgakopoulos and Vassilis Zakopoulos
Information 2025, 16(9), 780; https://doi.org/10.3390/info16090780 - 8 Sep 2025
Viewed by 238
Abstract
The development of online courses aligned with sustainable education principles is crucial for equipping learners with 21st-century skills essential for a sustainable future. As online education expands, predicting achievers (in this research, students with a final grade of seven or higher) becomes essential for optimizing instructional strategies and improving retention rates. This study employs a Linear Discriminant Analysis (LDA) model to predict academic performance in an online theatre course rooted in sustainable education principles. Engagement metrics such as total logins and collaborative assignment completion emerged as decisive predictors, aligning with prior research emphasizing active learning and collaboration. The model demonstrated robust performance, achieving 90% accuracy, 80% specificity, and an 88% correct classification rate. These results underscore the potential of machine learning in identifying achievers while highlighting the significance of sustainable pedagogical components. Future research should explore emotional engagement indicators and multi-course validation to enhance predictive capabilities. By utilizing the e-learning system information, the presented methodology has the potential to assist institutional policymakers in enhancing learning outcomes, advancing sustainability goals, and supporting innovation across the educational and creative sectors. Full article
(This article belongs to the Special Issue Advancing Educational Innovation with Artificial Intelligence)
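
The prediction step can be sketched with scikit-learn's LinearDiscriminantAnalysis on synthetic engagement features, as below; the login counts, collaboration flags, and achievement rule are invented placeholders, not the course data.

```python
# LDA classifier over toy engagement features standing in for the course records.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 200
logins = rng.poisson(40, n)                       # total logins per student (toy)
collab_done = rng.binomial(1, 0.6, n)             # collaborative assignment completed?
X = np.column_stack([logins, collab_done])
# Toy rule: more logins and completed collaborative work raise the odds of achieving.
y = (0.05 * logins + 1.5 * collab_done + rng.normal(0, 1, n) > 3.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
lda = LinearDiscriminantAnalysis().fit(X_tr, y_tr)
print("accuracy:", lda.score(X_te, y_te))
```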

21 pages, 5562 KB  
Article
LSNet: Adaptive Latent Space Networks for Vulnerability Severity Assessment
by Yizhou Wang, Jin Zhang and Mingfeng Huang
Information 2025, 16(9), 779; https://doi.org/10.3390/info16090779 - 8 Sep 2025
Viewed by 290
Abstract
As software vulnerabilities become increasingly harmful, more efficient vulnerability assessment methods are needed. However, existing methods mainly rely on manual updates and inefficient rule matching, and they struggle to capture potential correlations between vulnerabilities, resulting in issues such as strong subjectivity and low efficiency. To this end, a vulnerability severity assessment method named Latent Space Networks (LSNet) is proposed in this paper. Specifically, based on a clustering analysis of Common Vulnerability Scoring System (CVSS) metrics, we first exploit relations among CVSS metrics for metric prediction and propose an adaptive transformer to extract vulnerability representations from both global semantic and local latent-space features. Then, we utilize bidirectional encoding and token masking techniques to enhance the model’s understanding of vulnerability–location relationships, and combine the Transformer method with convolution to significantly improve the model’s ability to identify vulnerable text. Finally, extensive experiments conducted on the open vulnerability dataset and the CCF OSC2024 dataset demonstrate that LSNet is capable of extracting potential correlation features. Compared with baseline methods, including SVM, Transformer, TextCNN, BERT, DeBERTa, ALBERT, and RoBERTa, it exhibits higher accuracy and efficiency. Full article
(This article belongs to the Topic Addressing Security Issues Related to Modern Software)
Show Figures

Figure 1
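The abstract above pairs a transformer, for global semantics, with convolution, for local patterns. The sketch below is a hypothetical, stripped-down illustration of that fusion idea in PyTorch; it is not the authors' LSNet architecture, and the layer sizes, class name, and dummy batch are assumptions.

```python
# Hypothetical fusion of transformer (global) and 1-D convolution (local) features
# for CVSS-style severity classification of vulnerability descriptions.
import torch
import torch.nn as nn

class TinyVulnClassifier(nn.Module):
    def __init__(self, vocab_size=5000, d_model=128, num_classes=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)       # global semantics
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=3, padding=1)   # local patterns
        self.head = nn.Linear(2 * d_model, num_classes)

    def forward(self, token_ids):                       # token_ids: (batch, seq_len)
        x = self.embed(token_ids)                       # (batch, seq, d_model)
        g = self.encoder(x).mean(dim=1)                 # pooled global features
        l = self.conv(x.transpose(1, 2)).mean(dim=2)    # pooled local features
        return self.head(torch.cat([g, l], dim=-1))     # fused prediction

tokens = torch.randint(0, 5000, (8, 64))                # dummy batch of tokenized descriptions
logits = TinyVulnClassifier()(tokens)
print(logits.shape)                                     # torch.Size([8, 4])
```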

33 pages, 1260 KB  
Review
Identity Management Systems: A Comprehensive Review
by Zhengze Feng, Ziyi Li, Hui Cui and Monica T. Whitty
Information 2025, 16(9), 778; https://doi.org/10.3390/info16090778 - 8 Sep 2025
Viewed by 319
Abstract
Blockchain technology has introduced new paradigms for identity management systems (IDMSs), enabling users to regain control over their identity data and reduce reliance on centralized authorities. In recent years, numerous blockchain-based IDMS solutions have emerged across both practical application domains and academic research. [...] Read more.
Blockchain technology has introduced new paradigms for identity management systems (IDMSs), enabling users to regain control over their identity data and reduce reliance on centralized authorities. In recent years, numerous blockchain-based IDMS solutions have emerged across both practical application domains and academic research. However, prior reviews often focus on single application areas, provide limited cross-domain comparison, and insufficiently address security challenges such as interoperability, revocation, and quantum resilience. This paper bridges these gaps by presenting the first comprehensive survey that examines IDMSs from three complementary perspectives: (i) historical evolution from centralized and federated models to blockchain-based decentralized architectures; (ii) a cross-domain taxonomy of blockchain-based IDMSs, encompassing both general-purpose designs and domain-specific implementations; and (iii) a security analysis of threats across the full identity lifecycle. Drawing on a systematic review of 47 studies published between 2019 and 2025 and conducted in accordance with the PRISMA methodology, the paper synthesizes academic research and real-world deployments to identify unresolved technical, economic, and social challenges, and to outline directions for future research. The findings aim to serve as a timely reference for both researchers and practitioners working toward secure, interoperable, and sustainable blockchain-based IDMSs. Full article
Show Figures

Figure 1

19 pages, 2646 KB  
Article
A Comprehensive Study of MCS-TCL: Multi-Functional Sampling for Trustworthy Compressive Learning
by Fuma Kimishima, Jian Yang and Jinjia Zhou
Information 2025, 16(9), 777; https://doi.org/10.3390/info16090777 - 7 Sep 2025
Viewed by 212
Abstract
Compressive Learning (CL) is an emerging paradigm that allows machine learning models to perform inference directly from compressed measurements, significantly reducing sensing and computational costs. While existing CL approaches have achieved competitive accuracy compared to traditional image-domain methods, they typically rely on reconstruction [...] Read more.
Compressive Learning (CL) is an emerging paradigm that allows machine learning models to perform inference directly from compressed measurements, significantly reducing sensing and computational costs. While existing CL approaches have achieved competitive accuracy compared to traditional image-domain methods, they typically rely on reconstruction to address information loss and often neglect uncertainty arising from ambiguous or insufficient data. In this work, we propose MCS-TCL, a novel and trustworthy CL framework based on Multi-functional Compressive Sensing Sampling. Our approach unifies sampling, compression, and feature extraction into a single operation by leveraging the compatibility between compressive sensing and convolutional feature learning. This joint design enables efficient signal acquisition while preserving discriminative information, leading to feature representations that remain robust across varying sampling ratios. To enhance the model’s reliability, we incorporate evidential deep learning (EDL) during training. EDL estimates the distribution of evidence over output classes, enabling the model to quantify predictive uncertainty and assign higher confidence to well-supported predictions. Extensive experiments on image classification tasks show that MCS-TCL outperforms existing CL methods, achieving state-of-the-art accuracy at a low sampling rate of 6%. Additionally, our framework reduces model size by 85.76% while providing meaningful uncertainty estimates, demonstrating its effectiveness in resource-constrained learning scenarios. Full article
(This article belongs to the Special Issue AI-Based Image Processing and Computer Vision)
Show Figures

Figure 1
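Two ingredients of the framework described above, learned compressive sampling via convolution and an evidential (Dirichlet) output head, can be sketched as follows. This is a hypothetical illustration with assumed shapes and layer sizes, not the authors' MCS-TCL implementation.

```python
# Hypothetical sketch: a strided convolution acts as a learned block-wise sampling
# operator, and an evidential head converts non-negative evidence into Dirichlet
# parameters so class probabilities and an uncertainty mass can both be read off.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EvidentialCompressiveNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.sampler = nn.Conv2d(1, 16, kernel_size=8, stride=8)   # sampling + feature extraction
        self.backbone = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                                      nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.evidence = nn.Linear(32, num_classes)

    def forward(self, x):
        z = self.backbone(self.sampler(x))
        e = F.softplus(self.evidence(z))                 # evidence >= 0
        alpha = e + 1.0                                  # Dirichlet parameters
        prob = alpha / alpha.sum(dim=1, keepdim=True)    # expected class probabilities
        uncertainty = alpha.shape[1] / alpha.sum(dim=1)  # EDL-style uncertainty mass
        return prob, uncertainty

x = torch.randn(4, 1, 32, 32)                            # dummy image batch
prob, unc = EvidentialCompressiveNet()(x)
print(prob.shape, unc.shape)                             # torch.Size([4, 10]) torch.Size([4])
```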

15 pages, 3574 KB  
Article
Prior Knowledge Shapes Success When Large Language Models Are Fine-Tuned for Biomedical Term Normalization
by Daniel B. Hier, Steven K. Platt and Anh Nguyen
Information 2025, 16(9), 776; https://doi.org/10.3390/info16090776 - 7 Sep 2025
Viewed by 247
Abstract
Large language models (LLMs) often fail to correctly associate biomedical terms with their standardized ontology identifiers, posing challenges for downstream applications that rely on accurate, machine-readable codes. These linking failures can compromise the integrity of data used in precision medicine, clinical decision support, [...] Read more.
Large language models (LLMs) often fail to correctly associate biomedical terms with their standardized ontology identifiers, posing challenges for downstream applications that rely on accurate, machine-readable codes. These linking failures can compromise the integrity of data used in precision medicine, clinical decision support, and population health. Fine-tuning can partially remedy these issues, but the degree of improvement varies across terms and terminologies. Focusing on the Human Phenotype Ontology (HPO), we show that a model’s prior knowledge of term–identifier pairs, acquired during pre-training, strongly predicts whether fine-tuning will enhance its linking accuracy. We evaluate prior knowledge in three complementary ways: (1) latent probabilistic knowledge, revealed through stochastic prompting, captures hidden associations not evident in deterministic output; (2) partial subtoken knowledge, reflected in incomplete but non-random generation of identifier components; and (3) term familiarity, inferred from annotation frequencies in the biomedical literature, which serve as a proxy for training exposure. We then assess how these forms of prior knowledge influence the accuracy of deterministic identifier linking. Fine-tuning performance varies most for terms in what we call the reactive middle zone of the ontology—terms with intermediate levels of prior knowledge that are neither absent nor fully consolidated. Fine-tuning was most successful when prior knowledge, as measured by partial subtoken knowledge, was ‘weak’ or ‘medium’, or when prior knowledge, as measured by latent probabilistic knowledge, was ‘unknown’ or ‘weak’ (p<0.001). These terms from the ‘reactive middle’ exhibited the largest gains or losses in accuracy during fine-tuning, suggesting that the success of knowledge injection critically depends on the level of term–identifier pair knowledge in the LLM before fine-tuning. Full article
Show Figures

Figure 1
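Two of the prior-knowledge probes described in the abstract above can be approximated, in spirit, by simple measurement functions. The sketch below is purely illustrative: `sample_identifier` is a stand-in for a temperature-sampled LLM call, the candidate HPO codes are arbitrary, and the subtoken-overlap measure is an assumed simplification of the paper's definition.

```python
# Hypothetical probes: (1) latent probabilistic knowledge = how often stochastic
# sampling yields the correct HPO identifier; (2) partial subtoken knowledge =
# how much of the gold identifier's digit content appears in one prediction.
import random
from collections import Counter

def sample_identifier(term: str) -> str:
    """Placeholder for a temperature-sampled LLM completion (illustrative only)."""
    candidates = ["HP:0001250", "HP:0001251", "HP:0002069"]
    return random.choice(candidates)

def latent_probabilistic_knowledge(term: str, gold_id: str, n_samples: int = 50) -> float:
    """Fraction of stochastic samples that produce the correct identifier."""
    hits = sum(sample_identifier(term) == gold_id for _ in range(n_samples))
    return hits / n_samples

def partial_subtoken_knowledge(predicted_id: str, gold_id: str) -> float:
    """Overlap between digit subtokens of the predicted and gold identifiers."""
    pred = Counter(predicted_id.replace("HP:", ""))
    gold = Counter(gold_id.replace("HP:", ""))
    return sum((pred & gold).values()) / sum(gold.values())

print(latent_probabilistic_knowledge("Seizure", "HP:0001250"))
print(partial_subtoken_knowledge("HP:0001251", "HP:0001250"))
```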

20 pages, 15996 KB  
Article
A Gramian Angular Field-Based Convolutional Neural Network Approach for Crack Detection in Low-Power Turbines from Vibration Signals
by Angel H. Rangel-Rodriguez, Juan P. Amezquita-Sanchez, David Granados-Lieberman, David Camarena-Martinez, Maximiliano Bueno-Lopez and Martin Valtierra-Rodriguez
Information 2025, 16(9), 775; https://doi.org/10.3390/info16090775 - 6 Sep 2025
Viewed by 431
Abstract
The detection of damage in wind turbine blades is critical for ensuring their operational efficiency and longevity. This study presents a novel method for wind turbine blade damage detection, utilizing Gramian Angular Field (GAF) transformations of vibration signals in combination with Convolutional Neural [...] Read more.
The detection of damage in wind turbine blades is critical for ensuring their operational efficiency and longevity. This study presents a novel method for wind turbine blade damage detection, utilizing Gramian Angular Field (GAF) transformations of vibration signals in combination with Convolutional Neural Networks (CNNs). The GAF method transforms vibration signals, captured using a triaxial accelerometer, into angular representations that preserve temporal dependencies and reveal distinctive texture patterns that can be associated with structural damage. This transformation enables CNNs to identify complex features correlated with crack severity in wind turbine blades, thereby enhancing the precision and effectiveness of turbine fault diagnosis. The GAF-CNN model achieved a notable classification accuracy of over 99.9%, demonstrating its robustness and potential for automated damage detection. Unlike traditional methods, which rely on expert interpretation and are sensitive to noise, the proposed system offers a more efficient and precise tool for damage monitoring. The findings suggest that this method can significantly enhance wind turbine condition monitoring systems, reducing dependency on manual inspections and improving early detection capabilities. Full article
(This article belongs to the Special Issue Signal Processing Based on Machine Learning Techniques)
Show Figures

Figure 1
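The Gramian Angular Field transformation referenced above maps a 1-D signal to a 2-D image via polar-angle encoding. The following is a minimal sketch of the summation-field variant applied to a synthetic signal; the article's pipeline uses real triaxial accelerometer data and a trained CNN, which are not reproduced here.

```python
# Minimal Gramian Angular Summation Field (GASF) sketch for a vibration-like signal.
import numpy as np

def gramian_angular_field(signal: np.ndarray) -> np.ndarray:
    """Return the GASF image of a 1-D signal."""
    x_min, x_max = signal.min(), signal.max()
    x = 2.0 * (signal - x_min) / (x_max - x_min) - 1.0        # rescale to [-1, 1]
    x = np.clip(x, -1.0, 1.0)
    # GASF[i, j] = cos(phi_i + phi_j) = x_i * x_j - sqrt(1 - x_i^2) * sqrt(1 - x_j^2)
    s = np.sqrt(1.0 - x ** 2)
    return np.outer(x, x) - np.outer(s, s)

t = np.linspace(0, 1, 128)
vibration = np.sin(2 * np.pi * 25 * t) + 0.1 * np.random.randn(t.size)  # toy vibration signal
gaf_image = gramian_angular_field(vibration)
print(gaf_image.shape)   # (128, 128), ready to feed into a CNN
```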

21 pages, 371 KB  
Article
A Generalized Method for Filtering Noise in Open-Source Project Selection
by Yi Ding, Qing Fang and Xiaoyan Liu
Information 2025, 16(9), 774; https://doi.org/10.3390/info16090774 - 6 Sep 2025
Viewed by 279
Abstract
GitHub hosts over 10 million repositories, providing researchers with vast opportunities to study diverse software engineering problems. However, as anyone can create a repository for any purpose at no cost, open-source platforms contain many non-cooperative or non-developmental noise projects (e.g., repositories of dotfiles). [...] Read more.
GitHub hosts over 10 million repositories, providing researchers with vast opportunities to study diverse software engineering problems. However, as anyone can create a repository for any purpose at no cost, open-source platforms contain many non-cooperative or non-developmental noise projects (e.g., repositories of dotfiles). When selecting open-source projects for analysis, mixing collaborative coding projects (e.g., machine learning frameworks) with noisy projects may bias research findings. To solve this problem, we optimize the Semi-Automatic Decision Tree Method (SADTM), an existing Collaborative Coding Project (CCP) classification method, to improve its generality and accuracy. We evaluate our method on the GHTorrent dataset (2012–2020) and find that it effectively enhances CCP classification in two key ways: (1) it demonstrates greater stability than existing methods, yielding consistent results across different datasets; (2) it achieves high precision, with an F-measure ranging from 0.780 to 0.893. Our method outperforms existing techniques in filtering noise and selecting CCPs, enabling researchers to extract high-quality open-source projects from candidate samples with reliable accuracy. Full article
(This article belongs to the Topic Software Engineering and Applications)
Show Figures

Figure 1
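As a rough illustration of decision-tree-based project filtering in the spirit of the abstract above, the sketch below classifies repositories as collaborative coding projects (CCPs) or noise from a few metadata features. The features, synthetic labels, and thresholds are assumptions made for illustration; the authors' SADTM-based method operates on GHTorrent data and differs in detail.

```python
# Hypothetical sketch: separate CCPs from noise repositories with a shallow decision tree.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score

rng = np.random.default_rng(1)
n = 400
# Illustrative features: distinct committers, merged pull requests, closed issues
X = np.column_stack([rng.poisson(3, n), rng.poisson(5, n), rng.poisson(8, n)])
# Toy ground truth: projects with genuine collaboration count as CCPs
y = ((X[:, 0] >= 2) & (X[:, 1] >= 3)).astype(int)

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X[:300], y[:300])
pred = tree.predict(X[300:])
print("F-measure:", round(f1_score(y[300:], pred), 3))
```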
