Search Results (1,122)

Search Parameters:
Keywords = multimodal deep learning

18 pages, 5986 KB  
Article
A Graph Contrastive Learning Method for Enhancing Genome Recovery in Complex Microbial Communities
by Guo Wei and Yan Liu
Entropy 2025, 27(9), 921; https://doi.org/10.3390/e27090921 - 31 Aug 2025
Abstract
Accurate genome binning is essential for resolving microbial community structure and functional potential from metagenomic data. However, existing approaches—primarily reliant on tetranucleotide frequency (TNF) and abundance profiles—often perform sub-optimally in the face of complex community compositions, low-abundance taxa, and long-read sequencing datasets. To address these limitations, we present MBGCCA, a novel metagenomic binning framework that synergistically integrates graph neural networks (GNNs), contrastive learning, and information-theoretic regularization to enhance binning accuracy, robustness, and biological coherence. MBGCCA operates in two stages: (1) multimodal information integration, where TNF and abundance profiles are fused via a deep neural network trained using a multi-view contrastive loss, and (2) self-supervised graph representation learning, which leverages assembly graph topology to refine contig embeddings. The contrastive learning objective follows the InfoMax principle by maximizing mutual information across augmented views and modalities, encouraging the model to extract globally consistent and high-information representations. By aligning perturbed graph views while preserving topological structure, MBGCCA effectively captures both global genomic characteristics and local contig relationships. Comprehensive evaluations using both synthetic and real-world datasets—including wastewater and soil microbiomes—demonstrate that MBGCCA consistently outperforms state-of-the-art binning methods, particularly in challenging scenarios marked by sparse data and high community complexity. These results highlight the value of entropy-aware, topology-preserving learning for advancing metagenomic genome reconstruction. Full article
(This article belongs to the Special Issue Network-Based Machine Learning Approaches in Bioinformatics)
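The multi-view contrastive objective the abstract describes (maximizing mutual information across augmented views, InfoNCE-style) can be sketched as follows. The embeddings, dimensions, and temperature below are illustrative stand-ins, not MBGCCA's actual networks or hyperparameters:

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    # L2-normalize so dot products are cosine similarities
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temperature          # (n, n); diagonal holds positive pairs
    sim = sim - sim.max(axis=1, keepdims=True)   # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # pull each positive pair together, push all other pairs apart
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
tnf_view = rng.normal(size=(8, 16))                          # stand-in "TNF" embeddings
abundance_view = tnf_view + 0.01 * rng.normal(size=(8, 16))  # well-aligned second view
random_view = rng.normal(size=(8, 16))                       # unrelated view
loss_aligned = info_nce_loss(tnf_view, abundance_view)
loss_random = info_nce_loss(tnf_view, random_view)
```

As expected for a mutual-information-style objective, the loss is low when the two views agree per-sample and high when they are unrelated.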
22 pages, 2691 KB  
Article
A Short-Term Load Forecasting Method for Typical High Energy-Consuming Industrial Parks Based on Multimodal Decomposition and Hybrid Neural Networks
by Jingyu Li, Yu Shi, Na Zhang and Yuanyu Chen
Appl. Sci. 2025, 15(17), 9578; https://doi.org/10.3390/app15179578 - 30 Aug 2025
Abstract
High energy-consuming industrial parks are characterized by high base-load-to-peak-valley ratios, overlapping production cycles, and megawatt-scale step changes, which significantly complicate short-term load forecasting. To tackle these challenges, this study proposes a novel forecasting framework that combines hierarchical multimodal decomposition with a hybrid deep learning architecture. First, Maximal Information Coefficient (MIC) analysis is applied to identify key input features and eliminate redundancy. The load series is then decomposed in two stages: seasonal-trend decomposition using Loess (STL) isolates trend and seasonal components, while variational mode decomposition (VMD) further disaggregates the residual into multi-scale modes. This hierarchical approach enhances signal clarity and preserves temporal structure. A parallel neural architecture is subsequently developed, integrating an Informer network to model long-term trends and a bidirectional gated recurrent unit (BiGRU) to capture short-term fluctuations. Case studies based on real-world load data from a typical industrial park in northeastern China demonstrate that the proposed model achieves significantly improved forecasting accuracy and robustness compared to benchmark methods. These results provide strong technical support for fine-grained load prediction and intelligent dispatch in high energy-consuming industrial scenarios. Full article
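The two-stage decomposition idea can be illustrated with a minimal stand-in: moving averages play the role of STL here, and the residual is simply returned rather than passed on to VMD (both STL and VMD would normally come from dedicated libraries such as statsmodels or vmdpy). All series and parameters below are synthetic:

```python
import numpy as np

def two_stage_decompose(load, period=24, window=168):
    # stage 1a: trend via an edge-normalized moving average (STL stand-in)
    kernel = np.ones(window) / window
    trend = (np.convolve(load, kernel, mode="same")
             / np.convolve(np.ones_like(load), kernel, mode="same"))
    detrended = load - trend
    # stage 1b: seasonal component by averaging each hour-of-day slot
    slot_means = np.array([detrended[i::period].mean() for i in range(period)])
    seasonal = np.tile(slot_means, len(load) // period + 1)[: len(load)]
    # stage 2: in the paper, this residual would go on to VMD for multi-scale modes
    residual = load - trend - seasonal
    return trend, seasonal, residual

t = np.arange(24 * 28, dtype=float)    # four weeks of hourly "load"
load = 50 + 0.01 * t + 10 * np.sin(2 * np.pi * t / 24)
trend, seasonal, residual = two_stage_decompose(load)
```

The decomposition is exact by construction (the components sum back to the series), and the residual carries far less variance than the raw load, which is what makes the downstream forecasters' job easier.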
30 pages, 2138 KB  
Review
A SPAR-4-SLR Systematic Review of AI-Based Traffic Congestion Detection: Model Performance Across Diverse Data Types
by Doha Bakir, Khalid Moussaid, Zouhair Chiba, Noreddine Abghour and Amina El omri
Smart Cities 2025, 8(5), 143; https://doi.org/10.3390/smartcities8050143 - 30 Aug 2025
Abstract
Traffic congestion remains a major urban challenge, impacting economic productivity, environmental sustainability, and commuter well-being. This systematic review investigates how artificial intelligence (AI) techniques contribute to detecting traffic congestion. Following the SPAR-4-SLR protocol, we analyzed 44 peer-reviewed studies covering three data categories—spatiotemporal, probe, and hybrid/multimodal—and four AI model types—shallow machine learning (SML), deep learning (DL), probabilistic reasoning (PR), and hybrid approaches. Each model category was evaluated against metrics such as accuracy, the F1-score, computational efficiency, and deployment feasibility. Our findings reveal that SML techniques, particularly decision trees combined with optical flow, are optimal for real-time, low-resource applications. CNN-based DL models excel in handling unstructured and variable environments, while hybrid models offer improved robustness through multimodal data fusion. Although PR methods are less common, they add value when integrated with other paradigms to address uncertainty. This review concludes that no single AI approach is universally the best; rather, model selection should be aligned with the data type, application context, and operational constraints. This study offers actionable guidance for researchers and practitioners aiming to build scalable, context-aware AI systems for intelligent traffic management. Full article
(This article belongs to the Special Issue Cost-Effective Transportation Planning for Smart Cities)
16 pages, 1500 KB  
Article
Emotion Recognition in Autistic Children Through Facial Expressions Using Advanced Deep Learning Architectures
by Petra Radočaj and Goran Martinović
Appl. Sci. 2025, 15(17), 9555; https://doi.org/10.3390/app15179555 - 30 Aug 2025
Abstract
Atypical and subtle facial expression patterns in individuals with autism spectrum disorder (ASD) pose a significant challenge for automated emotion recognition. This study evaluates and compares the performance of convolutional neural networks (CNNs) and transformer-based deep learning models for facial emotion recognition in this population. Using a labeled dataset of emotional facial images, we assessed eight models across four emotion categories: natural, anger, fear, and joy. Our results demonstrate that transformer models consistently outperformed CNNs in both overall and emotion-specific metrics. Notably, the Swin Transformer achieved the highest performance, with an accuracy of 0.8000 and an F1-score of 0.7889, significantly surpassing all CNN counterparts. While CNNs failed to detect the fear class, transformer models showed a measurable capability in identifying complex emotions such as anger and fear, suggesting an enhanced ability to capture subtle facial cues. Analysis of the confusion matrix further confirmed the transformers’ superior classification balance and generalization. Despite these promising results, the study has limitations, including class imbalance and its reliance solely on facial imagery. Future work should explore multimodal emotion recognition, model interpretability, and personalization for real-world applications. Research also demonstrates the potential of transformer architectures in advancing inclusive, emotion-aware AI systems tailored for autistic individuals. Full article
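For readers comparing the reported accuracy and F1 figures: per-class F1 computed from a confusion matrix makes clear why a class a model never predicts (as with fear and the CNNs) scores exactly zero. The matrix below is hypothetical, not the paper's data:

```python
import numpy as np

emotions = ["natural", "anger", "fear", "joy"]
# hypothetical confusion matrix: rows = true class, columns = predicted class
cm = np.array([
    [18,  1, 0,  1],   # natural
    [ 2, 14, 0,  4],   # anger
    [ 3,  5, 0,  2],   # fear -- never predicted, as with the CNNs
    [ 1,  0, 0, 19],   # joy
])

def per_class_f1(cm):
    tp = np.diag(cm).astype(float)
    col, row = cm.sum(axis=0), cm.sum(axis=1)
    precision = np.divide(tp, col, out=np.zeros_like(tp), where=col > 0)
    recall = np.divide(tp, row, out=np.zeros_like(tp), where=row > 0)
    denom = precision + recall
    # F1 is the harmonic mean; a never-predicted class gets 0 by convention
    return np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)

f1 = per_class_f1(cm)
accuracy = np.diag(cm).sum() / cm.sum()
```

A respectable overall accuracy can therefore coexist with a zero F1 on a minority class, which is why the paper reports both.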

27 pages, 1211 KB  
Article
Robust Supervised Deep Discrete Hashing for Cross-Modal Retrieval
by Xiwei Dong, Fei Wu, Junqiu Zhai, Fei Ma, Guangxing Wang, Tao Liu, Xiaogang Dong and Xiao-Yuan Jing
Technologies 2025, 13(9), 383; https://doi.org/10.3390/technologies13090383 - 29 Aug 2025
Abstract
The exponential growth of multi-modal data in the real world poses significant challenges to efficient retrieval, and traditional single-modal methods can no longer keep pace with this growth. To address this issue, hashing retrieval methods play an important role in cross-modal retrieval tasks involving large amounts of multi-modal data. However, effectively embedding multi-modal data into a common low-dimensional Hamming space remains challenging. A critical issue is that feature redundancies in existing methods lead to suboptimal hash codes, severely degrading retrieval performance; yet, selecting optimal features remains an open problem in deep cross-modal hashing. In this paper, we propose an end-to-end approach, named Robust Supervised Deep Discrete Hashing (RSDDH), which can accomplish feature learning and hash learning simultaneously. RSDDH has a hybrid deep architecture consisting of a convolutional neural network and a multilayer perceptron that adaptively learn modality-specific representations. Moreover, it utilizes a non-redundant feature selection strategy to select optimal features for generating discriminative hash codes. Furthermore, it employs a direct discrete hashing scheme (SVDDH) to solve the binary constraint optimization problem without relaxation, fully preserving the intrinsic properties of hash codes. Additionally, RSDDH employs inter-modal and intra-modal consistency preservation strategies to reduce the gap between modalities and improve the discriminability of the learned Hamming space. Extensive experiments on four benchmark datasets demonstrate that RSDDH significantly outperforms state-of-the-art cross-modal hashing methods. Full article
(This article belongs to the Special Issue Image Analysis and Processing)
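The general cross-modal hashing pattern the paper builds on, binary codes compared by Hamming distance, can be sketched with a random projection standing in for RSDDH's learned networks; all data and the projection below are synthetic:

```python
import numpy as np

def to_hash_codes(features, projection):
    # sign-threshold projected features into {-1, +1} codes
    return np.sign(features @ projection).astype(np.int8)

def hamming_rank(query_code, db_codes):
    # for {-1,+1} codes of length b: d_H(q, x) = (b - q . x) / 2
    b = query_code.size
    dists = (b - db_codes @ query_code) / 2
    return np.argsort(dists), dists

rng = np.random.default_rng(1)
proj = rng.normal(size=(64, 32))                  # shared projection -> 32-bit codes
img_feats = rng.normal(size=(100, 64))            # one modality (e.g. images)
txt_feats = img_feats + 0.1 * rng.normal(size=(100, 64))   # paired texts
img_codes = to_hash_codes(img_feats, proj)
txt_codes = to_hash_codes(txt_feats, proj)
order, dists = hamming_rank(txt_codes[0], img_codes)   # text query vs image database
```

Because Hamming distance on short codes reduces to cheap bitwise operations, retrieval scales to very large databases; the learning problem RSDDH addresses is making paired cross-modal items land on nearby codes, which the noisy "paired" features above only mimic.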
47 pages, 2691 KB  
Systematic Review
Buzzing with Intelligence: A Systematic Review of Smart Beehive Technologies
by Josip Šabić, Toni Perković, Petar Šolić and Ljiljana Šerić
Sensors 2025, 25(17), 5359; https://doi.org/10.3390/s25175359 - 29 Aug 2025
Abstract
Smart-beehive technologies represent a paradigm shift in beekeeping, transitioning from traditional, reactive methods toward proactive, data-driven management. This systematic literature review investigates the current landscape of intelligent systems applied to beehives, focusing on the integration of IoT-based monitoring, sensor modalities, machine learning techniques, and their applications in precision apiculture. The review adheres to PRISMA guidelines and analyzes 135 peer-reviewed publications identified through searches of Web of Science, IEEE Xplore, and Scopus between 1990 and 2025. It addresses key research questions related to the role of intelligent systems in early problem detection, hive condition monitoring, and predictive intervention. Common sensor types include environmental, acoustic, visual, and structural modalities, each supporting diverse functional goals such as health assessment, behavior analysis, and forecasting. A notable trend toward deep learning, computer vision, and multimodal sensor fusion is evident, particularly in applications involving disease detection and colony behavior modeling. Furthermore, the review highlights a growing corpus of publicly available datasets critical for the training and evaluation of machine learning models. Despite the promising developments, challenges remain in system integration, dataset standardization, and large-scale deployment. This review offers a comprehensive foundation for the advancement of smart apiculture technologies, aiming to improve colony health, productivity, and resilience in increasingly complex environmental conditions. Full article

24 pages, 1689 KB  
Article
Safeguarding Brand and Platform Credibility Through AI-Based Multi-Model Fake Profile Detection
by Vishwas Chakranarayan, Fadheela Hussain, Fayzeh Abdulkareem Jaber, Redha J. Shaker and Ali Rizwan
Future Internet 2025, 17(9), 391; https://doi.org/10.3390/fi17090391 - 29 Aug 2025
Abstract
The proliferation of fake profiles on social media presents critical cybersecurity and misinformation challenges, necessitating robust and scalable detection mechanisms. Such profiles weaken consumer trust, reduce user engagement, and ultimately harm brand reputation and platform credibility. As adversarial tactics and synthetic identity generation evolve, traditional rule-based and machine learning approaches struggle to detect evolving and deceptive behavioral patterns embedded in dynamic user-generated content. This study aims to develop an AI-driven, multi-modal deep learning-based detection system for identifying fake profiles that fuses textual, visual, and social network features to enhance detection accuracy. It also seeks to ensure scalability, adversarial robustness, and real-time threat detection capabilities suitable for practical deployment in industrial cybersecurity environments. To achieve these objectives, the current study proposes an integrated AI system that combines the Robustly Optimized BERT Pretraining Approach (RoBERTa) for deep semantic textual analysis, ConvNeXt for high-resolution profile image verification, and Heterogeneous Graph Attention Networks (Hetero-GAT) for modeling complex social interactions. The extracted features from all three modalities are fused through an attention-based late fusion strategy, enhancing interpretability, robustness, and cross-modal learning. Experimental evaluations on large-scale social media datasets demonstrate that the proposed RoBERTa-ConvNeXt-HeteroGAT model significantly outperforms baseline models, including Support Vector Machine (SVM), Random Forest, and Long Short-Term Memory (LSTM). The system achieves 98.9% accuracy, 98.4% precision, and a 98.6% F1-score, with a per-profile processing time of 15.7 milliseconds, enabling real-time applicability. Moreover, the model proves resilient against various types of attacks on text, images, and network activity.
This study advances the application of AI in cybersecurity by introducing a highly interpretable, multi-modal detection system that strengthens digital trust, supports identity verification, and enhances the security of social media platforms. This alignment of technical robustness with brand trust highlights the system’s value not only in cybersecurity but also in sustaining platform credibility and consumer confidence. This system provides practical value to a wide range of stakeholders, including platform providers, AI researchers, cybersecurity professionals, and public sector regulators, by enabling real-time detection, improving operational efficiency, and safeguarding online ecosystems. Full article
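The attention-based late fusion step can be sketched generically: score each modality embedding, softmax the scores into weights, and take the weighted sum. The embeddings and attention vector below are random stand-ins for the RoBERTa, ConvNeXt, and Hetero-GAT outputs and the learned attention parameters:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_late_fusion(text_emb, image_emb, graph_emb, w_attn):
    stack = np.stack([text_emb, image_emb, graph_emb])  # (3, d)
    weights = softmax(stack @ w_attn)   # one attention weight per modality
    return weights @ stack, weights     # weighted sum over modalities

rng = np.random.default_rng(2)
d = 16
fused, weights = attention_late_fusion(
    rng.normal(size=d),          # stand-in for the RoBERTa text embedding
    rng.normal(size=d),          # stand-in for the ConvNeXt image embedding
    rng.normal(size=d),          # stand-in for the Hetero-GAT graph embedding
    w_attn=rng.normal(size=d),   # untrained stand-in attention vector
)
```

The per-modality weights are what make this kind of fusion interpretable: inspecting them shows which modality drove a given decision.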

29 pages, 11689 KB  
Article
Enhanced Breast Cancer Diagnosis Using Multimodal Feature Fusion with Radiomics and Transfer Learning
by Nazmul Ahasan Maruf, Abdullah Basuhail and Muhammad Umair Ramzan
Diagnostics 2025, 15(17), 2170; https://doi.org/10.3390/diagnostics15172170 - 28 Aug 2025
Abstract
Background: Breast cancer remains a critical public health problem worldwide and is a leading cause of cancer-related mortality. Optimizing clinical outcomes is contingent upon the early and precise detection of malignancies. Advances in medical imaging and artificial intelligence (AI), particularly in the fields of radiomics and deep learning (DL), have contributed to improvements in early detection methodologies. Nonetheless, persistent challenges, including limited data availability, model overfitting, and restricted generalization, continue to hinder performance. Methods: This study aims to overcome existing challenges by improving model accuracy and robustness through enhanced data augmentation and the integration of radiomics and deep learning features from the CBIS-DDSM dataset. To mitigate overfitting and improve model generalization, data augmentation techniques were applied. The PyRadiomics library was used to extract radiomics features, while transfer learning models were employed to derive deep learning features from the augmented training dataset. For radiomics feature selection, we compared multiple supervised feature selection methods, including RFE with random forest and logistic regression, ANOVA F-test, LASSO, and mutual information. Embedded methods with XGBoost, LightGBM, and CatBoost for GPUs were also explored. Finally, we integrated radiomics and deep features to build a unified multimodal feature space for improved classification performance. Based on this integrated set of radiomics and deep learning features, 13 pre-trained transfer learning models were trained and evaluated, including various versions of ResNet (50, 50V2, 101, 101V2, 152, 152V2), DenseNet (121, 169, 201), InceptionV3, MobileNet, and VGG (16, 19). Results: Among the evaluated models, ResNet152 achieved the highest classification accuracy of 97%, demonstrating the potential of this approach to enhance diagnostic precision. 
Other models, including VGG19, ResNet101V2, and ResNet101, achieved 96% accuracy, emphasizing the importance of the selected feature set in achieving robust detection. Conclusions: Future research could build on this work by incorporating Vision Transformer (ViT) architectures and leveraging multimodal data (e.g., clinical data, genomic information, and patient history). This could improve predictive performance and make the model more robust and adaptable to diverse data types. Ultimately, this approach has the potential to transform breast cancer detection, making it more accurate and interpretable. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
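Building a unified multimodal feature space from radiomics and deep features amounts to concatenation followed by selection and scaling. The sketch below uses a simple unsupervised variance criterion as a stand-in for the supervised selectors (RFE, LASSO, mutual information) the study compares, on synthetic data:

```python
import numpy as np

def fuse_and_select(radiomics, deep, k=32):
    fused = np.hstack([radiomics, deep])      # unified multimodal feature space
    variances = fused.var(axis=0)
    top = np.argsort(variances)[::-1][:k]     # keep the k most variable columns
    selected = fused[:, top]
    # z-score the surviving columns before feeding a classifier
    mu, sigma = selected.mean(axis=0), selected.std(axis=0)
    return (selected - mu) / np.where(sigma > 0, sigma, 1.0), top

rng = np.random.default_rng(3)
radiomics = rng.normal(size=(50, 40))   # e.g. PyRadiomics texture/shape features
deep = rng.normal(size=(50, 100))       # e.g. pooled transfer-learning features
X, kept = fuse_and_select(radiomics, deep)
```

A supervised selector would rank columns by their relevance to the malignant/benign label instead of raw variance; the fusion-then-select-then-scale pipeline shape is the same.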

44 pages, 3439 KB  
Review
Conventional to Deep Learning Methods for Hyperspectral Unmixing: A Review
by Jinlin Zou, Hongwei Qu and Peng Zhang
Remote Sens. 2025, 17(17), 2968; https://doi.org/10.3390/rs17172968 - 27 Aug 2025
Abstract
Hyperspectral images often contain many mixed pixels, primarily resulting from their inherent complexity and low spatial resolution. To enhance surface classification and improve sub-pixel target detection accuracy, hyperspectral unmixing has remained an active research topic. This review provides a comprehensive overview of methodologies for hyperspectral unmixing, from traditional to advanced deep learning approaches. A systematic analysis of various challenges is presented, clarifying underlying principles and evaluating the strengths and limitations of prevalent algorithms. Hyperspectral unmixing is critical for interpreting spectral imagery but faces significant challenges: limited ground-truth data, spectral variability, nonlinear mixing effects, computational demands, and barriers to practical commercialization. Future progress requires bridging the gap to applications through user-centric solutions and integrating multi-modal and multi-temporal data. Research priorities include uncertainty quantification, transfer learning for generalization, neuromorphic edge computing, and developing tuning-free foundation models for cross-scenario robustness. This paper is designed to foster the commercial application of hyperspectral unmixing algorithms and to offer robust support for engineering applications within the hyperspectral remote sensing domain. Full article
(This article belongs to the Special Issue Artificial Intelligence in Hyperspectral Remote Sensing Data Analysis)
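The core problem the review surveys can be stated with the linear mixing model, pixel ≈ endmembers · abundances. A rough sketch on synthetic spectra, with unconstrained least squares plus clipping standing in for properly constrained solvers such as FCLS:

```python
import numpy as np

def unmix_ls(pixel, endmembers):
    # linear mixing model: pixel ≈ endmembers @ abundances + noise
    a, *_ = np.linalg.lstsq(endmembers, pixel, rcond=None)
    a = np.clip(a, 0.0, None)     # enforce nonnegativity by clipping
    return a / a.sum()            # enforce sum-to-one by renormalizing

rng = np.random.default_rng(4)
E = rng.uniform(size=(120, 3))    # 120 spectral bands, 3 endmember spectra
true_a = np.array([0.6, 0.3, 0.1])
pixel = E @ true_a + 0.001 * rng.normal(size=120)
a_hat = unmix_ls(pixel, E)
```

Real scenes violate these assumptions (the endmember matrix is unknown, spectra vary, mixing is nonlinear), which is precisely where the deep learning approaches surveyed in the review come in.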

19 pages, 2102 KB  
Article
Multi-Modal Time-Frequency Image Fusion for Weak Target Detection on Sea Surface
by Han Wu, Hongyan Xing, Mengjie Li and Chenyu Hang
J. Mar. Sci. Eng. 2025, 13(9), 1625; https://doi.org/10.3390/jmse13091625 - 26 Aug 2025
Abstract
To address the difficulty of extracting weak-target features from one-dimensional radar signals against a strong sea clutter background, this paper proposes a weak target detection method that combines multi-modal time-frequency map fusion with deep learning. The one-dimensional signal is converted into three gray-scale maps with complementary characteristics by three signal processing methods: normalized continuous wavelet transform, normalized smooth pseudo Wigner-Ville distribution, and recurrence plot. The resulting two-dimensional grayscale maps are adaptively mapped to the R, G, and B channels through an adaptive weighting matrix for feature fusion, ultimately generating a fused color image. Subsequently, an improved multi-modal EfficientNetV2s classification framework is constructed, wherein the decision threshold of the Softmax layer is optimized to achieve controllable false alarm rates for weak signal detection. Experiments on the IPIX dataset and the China Yantai dataset show that the proposed method achieves a measurable improvement in detection performance over existing detection methods. Full article
(This article belongs to the Section Ocean Engineering)
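The channel-fusion step, mapping three complementary grayscale time-frequency maps to R, G, and B with per-channel weights, can be sketched as follows. The weights here are fixed stand-ins for the paper's adaptive weighting matrix, and the maps are random:

```python
import numpy as np

def fuse_to_rgb(cwt_map, spwvd_map, rp_map, weights):
    chans = []
    for m, w in zip([cwt_map, spwvd_map, rp_map], weights):
        m = (m - m.min()) / (m.max() - m.min() + 1e-12)  # normalize to [0, 1]
        chans.append(np.clip(w * m, 0.0, 1.0))           # weight the channel
    return np.stack(chans, axis=-1)                      # (H, W, 3) color image

rng = np.random.default_rng(5)
cwt, spwvd, rp = (rng.uniform(size=(64, 64)) for _ in range(3))
rgb = fuse_to_rgb(cwt, spwvd, rp, weights=(1.0, 0.8, 0.6))
```

Packing the three representations into the color channels lets an off-the-shelf image classifier such as EfficientNetV2 consume all three views in a single forward pass.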

34 pages, 945 KB  
Review
Artificial Intelligence in Ocular Transcriptomics: Applications of Unsupervised and Supervised Learning
by Catherine Lalman, Yimin Yang and Janice L. Walker
Cells 2025, 14(17), 1315; https://doi.org/10.3390/cells14171315 - 26 Aug 2025
Abstract
Transcriptomic profiling is a powerful tool for dissecting the cellular and molecular complexity of ocular tissues, providing insights into retinal development, corneal disease, macular degeneration, and glaucoma. With the expansion of microarray, bulk RNA sequencing (RNA-seq), and single-cell RNA-seq technologies, artificial intelligence (AI) has emerged as a key strategy for analyzing high-dimensional gene expression data. This review synthesizes AI-enabled transcriptomic studies in ophthalmology from 2019 to 2025, highlighting how supervised and unsupervised machine learning (ML) methods have advanced biomarker discovery, cell type classification, and eye development and ocular disease modeling. Here, we discuss unsupervised techniques, such as principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP), and weighted gene co-expression network analysis (WGCNA), now the standard in single-cell workflows. Supervised approaches are also discussed, including the least absolute shrinkage and selection operator (LASSO), support vector machines (SVMs), and random forests (RFs), and their utility in identifying diagnostic and prognostic markers in age-related macular degeneration (AMD), diabetic retinopathy (DR), glaucoma, keratoconus, thyroid eye disease, and posterior capsule opacification (PCO), as well as deep learning frameworks, such as variational autoencoders and neural networks that support multi-omics integration. Despite challenges in interpretability and standardization, explainable AI and multimodal approaches offer promising avenues for advancing precision ophthalmology. Full article
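Of the unsupervised techniques listed, PCA is the simplest to show concretely: center a cells-by-genes matrix, take the SVD, and project onto the leading components. The two synthetic "populations" below are separated along a handful of genes; real single-cell workflows run t-SNE or UMAP on top of exactly this kind of reduction:

```python
import numpy as np

def pca(X, n_components=2):
    Xc = X - X.mean(axis=0)                      # center each gene (column)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T            # per-cell coordinates
    explained = S**2 / np.sum(S**2)              # variance ratio per component
    return scores, explained[:n_components]

rng = np.random.default_rng(6)
pop_a = rng.normal(size=(30, 50))
pop_a[:, :5] += 4.0                # population A is shifted along five "genes"
pop_b = rng.normal(size=(30, 50))
X = np.vstack([pop_a, pop_b])      # 60 cells x 50 genes
scores, explained = pca(X)
```

The first component captures the between-population shift, which is why PCA scatter plots separate cell types before any clustering is applied.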

28 pages, 2252 KB  
Review
Technical Review: Architecting an AI-Driven Decision Support System for Enhanced Online Learning and Assessment
by Saipunidzam Mahamad, Yi Han Chin, Nur Izzah Nasuha Zulmuksah, Md Mominul Haque, Muhammad Shaheen and Kanwal Nisar
Future Internet 2025, 17(9), 383; https://doi.org/10.3390/fi17090383 - 26 Aug 2025
Abstract
The rapid expansion of online learning platforms has necessitated advanced systems to address scalability, personalization, and assessment challenges. This paper presents a comprehensive review of artificial intelligence (AI)-based decision support systems (DSSs) designed for online learning and assessment, synthesizing advancements from 2020 to 2025. By integrating machine learning, natural language processing, knowledge-based systems, and deep learning, AI-DSSs enhance educational outcomes through predictive analytics, automated grading, and personalized learning paths. This study examines system architecture, data requirements, model selection, and user-centric design, emphasizing their roles in achieving scalability and inclusivity. Through case studies of a MOOC platform using NLP and an adaptive learning system employing reinforcement learning, this paper highlights significant improvements in grading efficiency (up to 70%) and student performance (12–20% grade increases). Performance metrics, including accuracy, response time, and user satisfaction, are analyzed alongside evaluation frameworks combining quantitative and qualitative approaches. Technical challenges, such as model interpretability and bias, ethical concerns like data privacy, and implementation barriers, including cost and adoption resistance, are critically assessed, with proposed mitigation strategies. Future directions explore generative AI, multimodal integration, and cross-cultural studies to enhance global accessibility. This review offers a robust framework for researchers and practitioners, providing actionable insights for designing equitable, efficient, and scalable AI-DSSs to transform online education. Full article
(This article belongs to the Special Issue Generative Artificial Intelligence in Smart Societies)

31 pages, 3129 KB  
Review
A Review on Gas Pipeline Leak Detection: Acoustic-Based, OGI-Based, and Multimodal Fusion Methods
by Yankun Gong, Chao Bao, Zhengxi He, Yifan Jian, Xiaoye Wang, Haineng Huang and Xintai Song
Information 2025, 16(9), 731; https://doi.org/10.3390/info16090731 - 25 Aug 2025
Abstract
Pipelines play a vital role in material transportation within industrial settings. This review synthesizes detection technologies for early-stage small gas leaks from pipelines in the industrial sector, with a focus on acoustic-based methods, optical gas imaging (OGI), and multimodal fusion approaches. It encompasses detection principles, inherent challenges, mitigation strategies, and the state of the art (SOTA). Small leaks refer to low-flow leakage originating from defects with apertures at millimeter or submillimeter scales, posing significant detection difficulties. Acoustic detection leverages the acoustic wave signals generated by gas leaks for non-contact monitoring, offering advantages such as rapid response and broad coverage. However, its susceptibility to environmental noise interference often triggers false alarms. This limitation can be mitigated through time-frequency analysis, multi-sensor fusion, and deep-learning algorithms, which enhance leak signals, suppress background noise, and thereby improve detection robustness and accuracy. OGI utilizes infrared imaging technology to visualize leaking gas and is applicable to the detection of various polar gases. Its primary limitations include low image resolution, low contrast, and interference from complex backgrounds. Mitigation techniques involve background subtraction, optical flow estimation, fully convolutional neural networks (FCNNs), and vision transformers (ViTs), which enhance image contrast and extract multi-scale features to boost detection precision. Multimodal fusion technology integrates data from diverse sensors, such as acoustic and optical devices. Key challenges lie in achieving spatiotemporal synchronization across multiple sensors and effectively fusing heterogeneous data streams. Current methodologies primarily utilize decision-level fusion and feature-level fusion. Decision-level fusion offers high flexibility and ease of implementation but lacks inter-feature interaction; it is less effective than feature-level fusion when correlations exist between heterogeneous features. Feature-level fusion amalgamates data from different modalities during the feature extraction phase, generating a unified cross-modal representation that effectively resolves inter-modal heterogeneity. In conclusion, we posit that multimodal fusion holds significant potential for further enhancing detection accuracy beyond the capabilities of existing single-modality technologies and is poised to become a major focus of future research in this domain. Full article
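The distinction the review draws between the two fusion strategies can be illustrated with a minimal sketch. This is not code from the reviewed systems; the feature dimensions, weights, and function names are hypothetical, chosen only to contrast fusing representations (feature level) with fusing classifier outputs (decision level).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sample embeddings: an 8-dim acoustic feature
# and a 16-dim OGI (infrared image) feature.
acoustic_feat = rng.normal(size=8)
ogi_feat = rng.normal(size=16)

def feature_level_fusion(f_acoustic, f_ogi):
    """Merge modalities at the representation stage: concatenate
    features into one cross-modal vector before any classifier runs,
    so a downstream model can exploit inter-feature correlations."""
    return np.concatenate([f_acoustic, f_ogi])

def decision_level_fusion(p_acoustic, p_ogi, w_a=0.5, w_o=0.5):
    """Merge modalities at the decision stage: each modality's
    classifier has already produced a leak probability, and only
    these scalar decisions are combined (no feature interaction)."""
    return w_a * p_acoustic + w_o * p_ogi

fused = feature_level_fusion(acoustic_feat, ogi_feat)   # shape (24,)

# Stand-in per-modality classifier outputs, P(leak):
p_leak = decision_level_fusion(0.9, 0.6)                # ~0.75
```

The weighted average makes the trade-off concrete: decision-level fusion is trivial to implement and tolerant of missing modalities, but any correlation between acoustic and infrared features is lost before the combination step.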

40 pages, 4344 KB  
Review
Digital Cardiovascular Twins, AI Agents, and Sensor Data: A Narrative Review from System Architecture to Proactive Heart Health
by Nurdaulet Tasmurzayev, Bibars Amangeldy, Baglan Imanbek, Zhanel Baigarayeva, Timur Imankulov, Gulmira Dikhanbayeva, Inzhu Amangeldi and Symbat Sharipova
Sensors 2025, 25(17), 5272; https://doi.org/10.3390/s25175272 - 24 Aug 2025
Abstract
Cardiovascular disease remains the world’s leading cause of mortality, yet everyday care still relies on episodic, symptom-driven interventions that detect ischemia, arrhythmias, and remodeling only after tissue damage has begun, limiting the effectiveness of therapy. A narrative review synthesized 183 studies published between 2016 and 2025, located through PubMed, MDPI, Scopus, IEEE Xplore, and Web of Science. This review examines CVD diagnostics built on digital cardiovascular twins, which collect data from wearable IoT devices (electrocardiography (ECG), photoplethysmography (PPG), and mechanocardiography), clinical records, laboratory biomarkers, and genetic markers. It then considers the integration of these data with artificial intelligence (AI): machine learning and deep learning, including graph and transformer networks, for interpreting multi-dimensional data streams and building prognostic models; generative AI, medical large language models (LLMs), and autonomous agents for decision support, personalized alerts, and treatment-scenario modeling; and cloud and edge computing for data processing. This multi-layered architecture enables the detection of silent pathologies long before clinical manifestation, transforming continuous observations into actionable recommendations and shifting cardiology from reactive treatment to predictive and preventive care. Evidence converges on four layers: sensors streaming multimodal clinical and environmental data; hybrid analytics that integrate hemodynamic models with deep, graph, and transformer learning while Bayesian and Kalman filters manage uncertainty; decision support delivered by domain-tuned medical LLMs and autonomous agents; and prospective simulations that trial pacing or pharmacotherapy before bedside use, closing the prediction-intervention loop. This stack flags silent pathology weeks in advance and steers proactive, personalized prevention. It also lays the groundwork for software-as-a-medical-device ecosystems and new regulatory guidance for trustworthy AI-enabled cardiovascular care. Full article
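The abstract's mention of Kalman filters managing uncertainty in wearable sensor streams can be sketched minimally. This is an illustrative one-dimensional filter under a constant-level model, not the reviewed architecture; the noise parameters, initial state, and simulated heart-rate values are all hypothetical.

```python
import numpy as np

def kalman_1d(measurements, q=1e-3, r=0.5, x0=70.0, p0=1.0):
    """Scalar Kalman filter: the latent state (e.g. resting heart
    rate) is modeled as roughly constant with process noise q; each
    measurement carries noise r. Returns the filtered estimates."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: state carries over, uncertainty grows by q.
        p = p + q
        # Update: Kalman gain k blends prediction and measurement.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return np.array(estimates)

rng = np.random.default_rng(1)
true_hr = 72.0
# Simulated PPG-derived heart-rate readings with sensor noise.
noisy = true_hr + rng.normal(scale=2.0, size=200)
smoothed = kalman_1d(noisy)
```

After the initial transient, the filtered trace varies far less than the raw readings, which is exactly the uncertainty-management role the review assigns to Bayesian and Kalman filtering in the analytics layer.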
(This article belongs to the Section Biomedical Sensors)

21 pages, 2893 KB  
Article
Intelligent Fault Diagnosis System for Running Gear of High-Speed Trains
by Shuai Yang, Guoliang Gao, Ziyang Wang, Shengfeng Zeng, Yikai Ouyang and Guanglei Zhang
Sensors 2025, 25(17), 5269; https://doi.org/10.3390/s25175269 - 24 Aug 2025
Abstract
Conventional rail transit train running gear fault diagnosis mainly depends on routine maintenance inspections and manual judgment. However, these approaches lack robustness under complex operational environments and elevated noise levels, rendering them inadequate for the real-time performance and rigorous accuracy standards demanded by modern rail transit systems. Furthermore, many existing deep learning–based methods suffer from inherent limitations in feature extraction or incur prohibitive computational costs when processing multivariate time series data. This study represents one of the early efforts to introduce the TimesNet time series modeling framework into fault diagnosis for rail transit train running gear. By utilizing an innovative multi-period decomposition strategy and a mechanism for reshaping one-dimensional data into two-dimensional tensors, the framework enables advanced temporal-spatial representation of time series data. The algorithm is validated on both a high-speed train running gear bearing fault dataset and a multi-mode gearbox fault diagnosis dataset under variable working conditions. The TimesNet model exhibits outstanding diagnostic performance on both datasets, achieving a diagnostic accuracy of 91.7% on the high-speed train bearing fault dataset. Embedded deployment experiments demonstrate that single-sample inference completes within 70.3 ± 5.8 ms, satisfying the real-time monitoring requirement (<100 ms) with a 100% success rate over 50 consecutive tests. The two-dimensional reshaping approach inherent to TimesNet markedly enhances the model's capacity to capture intrinsic periodic structures within multivariate time series data, presenting a novel paradigm for the intelligent fault diagnosis of complex mechanical systems in train running gear. The integrated human–machine interaction system covers a closed-loop process of detection, diagnosis, and decision-making, laying a robust foundation for the continued development of predictive maintenance technologies for train running gear. Full article
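The core TimesNet idea the abstract highlights, detecting a dominant period and folding a 1-D series into a 2-D tensor, can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: TimesNet selects several top-k periods and processes the 2-D grids with convolutions, whereas this sketch keeps only the single dominant period, and the synthetic signal is hypothetical.

```python
import numpy as np

def fold_by_dominant_period(x):
    """Find the dominant period via FFT amplitude, then reshape the
    1-D series into a (cycles, period) 2-D array so that intra-period
    variation lies along rows and inter-period variation along columns."""
    n = len(x)
    amp = np.abs(np.fft.rfft(x - x.mean()))
    amp[0] = 0.0                       # ignore the DC component
    freq = int(np.argmax(amp))         # dominant frequency index
    period = n // freq
    cycles = n // period
    return x[:cycles * period].reshape(cycles, period), period

t = np.arange(512)
signal = np.sin(2 * np.pi * t / 32)    # synthetic vibration, period 32
grid, period = fold_by_dominant_period(signal)
# period == 32; grid has shape (16, 32): 16 cycles of 32 samples each.
```

Once folded this way, ordinary 2-D operations (convolutions in TimesNet) can capture both within-cycle shape and cycle-to-cycle drift, which is the "temporal-spatial representation" the abstract refers to.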
