Search Results (2,673)

Search Parameters:
Keywords = multi-modal learning

15 pages, 1889 KB  
Article
Predicting Sarcopenia in Peritoneal Dialysis Patients: A Multimodal Ultrasound-Based Logistic Regression Analysis and Nomogram Model
by Shengqiao Wang, Xiuyun Lu, Juan Chen, Xinliang Xu, Jun Jiang and Yi Dong
Diagnostics 2025, 15(21), 2685; https://doi.org/10.3390/diagnostics15212685 - 23 Oct 2025
Abstract
Objective: This study aimed to evaluate the diagnostic value of logistic regression and nomogram models based on multimodal ultrasound in predicting sarcopenia in patients with peritoneal dialysis (PD). Methods: A total of 178 patients with PD admitted to our nephrology department between June 2024 and April 2025 were enrolled. According to the 2019 Asian Working Group for Sarcopenia (AWGS) diagnostic criteria, patients were categorized into sarcopenia and non-sarcopenia groups. Ultrasound examinations were used to measure the muscle thickness (MT), pennation angle (PA), fascicle length (FL), attenuation coefficient (Atten Coe), and echo intensity (EI) of the medial head of the right gastrocnemius. The clinical characteristics of the groups were compared using the Mann–Whitney U test. Binary logistic regression was used to identify sarcopenia risk factors and to construct clinical prediction models and nomograms. Receiver operating characteristic (ROC) curves were used to assess model accuracy and stability. Results: The sarcopenia group exhibited significantly lower MT, PA, and FL, but higher Atten Coe and EI, than the non-sarcopenia group (all p < 0.05). A multimodal ultrasound logistic regression model was developed using machine learning: Logit(P) = −7.29 − 1.18 × MT − 0.074 × PA + 0.48 × FL + 0.52 × Atten Coe + 0.13 × EI (p < 0.05), achieving an F1-score of 0.785. The area under the ROC curve (ROC-AUC) was 0.902, with an optimal cut-off value of 0.45 (sensitivity 77.3%, specificity 56.7%). Nomogram consistency analysis showed no statistical difference between the ultrasound diagnosis and the appendicular skeletal muscle index (ASMI) measured by bioelectrical impedance analysis (BIA) (Z = 0.415, p > 0.05). Conclusions: The multimodal ultrasound-based prediction model effectively assists clinicians in identifying patients with PD at high risk of sarcopenia, enabling early intervention to improve clinical outcomes.
(This article belongs to the Section Medical Imaging and Theranostics)
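Because the published model is an ordinary binary logistic regression, its output can be reproduced directly from the reported coefficients. A minimal Python sketch (the measurement values below are illustrative placeholders, not patient data):

```python
import math

def sarcopenia_probability(mt, pa, fl, atten_coe, ei):
    """Predicted sarcopenia probability from the published logistic model."""
    logit = (-7.29 - 1.18 * mt - 0.074 * pa + 0.48 * fl
             + 0.52 * atten_coe + 0.13 * ei)
    return 1.0 / (1.0 + math.exp(-logit))  # inverse logit (sigmoid)

# Classify at the reported optimal cut-off of 0.45.
p = sarcopenia_probability(mt=1.4, pa=18.0, fl=4.5, atten_coe=0.9, ei=85.0)
print(f"P(sarcopenia) = {p:.3f} -> {'high risk' if p >= 0.45 else 'low risk'}")
```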

22 pages, 1080 KB  
Article
Modeling the Internal and Contextual Attention for Self-Supervised Skeleton-Based Action Recognition
by Wentian Xin, Yue Teng, Jikang Zhang, Yi Liu, Ruyi Liu, Yuzhi Hu and Qiguang Miao
Sensors 2025, 25(21), 6532; https://doi.org/10.3390/s25216532 - 23 Oct 2025
Abstract
Multimodal contrastive learning has achieved significant performance advantages in self-supervised skeleton-based action recognition. Previous methods are limited by modality imbalance, which reduces alignment accuracy and makes it difficult to combine important spatial–temporal frequency patterns, leading to confusion between modalities and weaker feature representations. To overcome these problems, we explore intra-modality feature-wise self-similarity and inter-modality instance-wise cross-consistency, and discover two inherent correlations that benefit recognition: (i) Global Perspective expresses how action semantics carry a broad and high-level understanding, which supports the use of globally discriminative feature representations. (ii) Focus Adaptation refers to the role of the frequency spectrum in guiding attention toward key joints by emphasizing compact and salient signal patterns. Building upon these insights, we propose a novel language–skeleton contrastive learning framework comprising two key components: (a) Feature Modulation, which constructs a skeleton–language action conceptual domain to minimize the expected information gain between vision and language modalities. (b) Frequency Feature Learning, which introduces a Frequency-domain Spatial–Temporal block (FreST) that focuses on sparse key human joints in the frequency domain with compact signal energy. Extensive experiments demonstrate that our method achieves remarkable action recognition performance on widely used benchmark datasets, including NTU RGB+D 60 and NTU RGB+D 120. On the challenging PKU-MMD dataset in particular, MICA achieves at least a 4.6% improvement over classical methods such as CrosSCLR and AimCLR, effectively demonstrating its ability to capture internal and contextual attention information.
(This article belongs to the Special Issue Deep Learning for Perception and Recognition: Method and Applications)
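For readers unfamiliar with language–skeleton alignment, the sketch below shows the symmetric InfoNCE-style objective commonly used for cross-modal contrastive learning; the embedding size and temperature are placeholder assumptions, not the MICA implementation.

```python
import torch
import torch.nn.functional as F

def cross_modal_contrastive_loss(skel_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss aligning skeleton and language embeddings.

    skel_emb, text_emb: (batch, dim) tensors whose matching rows come from
    the same action sample and therefore form the positive pairs.
    """
    s = F.normalize(skel_emb, dim=1)
    t = F.normalize(text_emb, dim=1)
    logits = s @ t.T / temperature                 # cosine-similarity logits
    targets = torch.arange(s.size(0), device=s.device)
    # Contrast in both directions: skeleton-to-text and text-to-skeleton.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.T, targets))

loss = cross_modal_contrastive_loss(torch.randn(8, 256), torch.randn(8, 256))
```
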
26 pages, 1979 KB  
Review
From Single-Sensor Constraints to Multisensor Integration: Advancing Sustainable Complex Ore Sorting
by Sefiu O. Adewuyi, Angelina Anani, Kray Luxbacher and Sehliselo Ndlovu
Minerals 2025, 15(11), 1101; https://doi.org/10.3390/min15111101 - 23 Oct 2025
Abstract
Processing complex ore remains a challenge due to energy-intensive grinding and complex beneficiation and pyrometallurgical treatments that consume large amounts of water whilst generating significant waste and polluting the environment. Sensor-based ore sorting, which separates ore particles based on their physical or chemical properties before downstream processing, is emerging as a transformative technology in mineral processing. However, its application to complex and heterogeneous ores remains limited by the constraints of single-sensor systems. In addition, existing hybrid sensor strategies are fragmented, and a consolidated framework for implementation is lacking. This review explores these challenges and underscores the potential of multimodal sensor integration for complex ore pre-concentration. A multi-sensor framework integrating machine learning and computer vision is proposed to overcome limitations in handling complex ores and enhance sorting efficiency. This approach can improve recovery rates, reduce energy and water consumption, and optimize process performance, thereby supporting more sustainable mining practices that contribute to the United Nations Sustainable Development Goals (UNSDGs). This work provides a roadmap for advancing efficient, resilient, and next-generation mineral processing operations.
(This article belongs to the Section Mineral Processing and Extractive Metallurgy)

25 pages, 2557 KB  
Article
Modality-Resilient Multimodal Industrial Anomaly Detection via Cross-Modal Knowledge Transfer and Dynamic Edge-Preserving Voxelization
by Jiahui Xu, Jian Yuan, Mingrui Yang and Weishu Yan
Sensors 2025, 25(21), 6529; https://doi.org/10.3390/s25216529 - 23 Oct 2025
Abstract
Achieving high-precision anomaly detection with incomplete sensor data is a critical challenge in industrial automation and intelligent manufacturing. This incompleteness often results from sensor failures, environmental interference, occlusions, or acquisition cost constraints. This study explicitly targets both types of incompleteness commonly encountered in industrial multimodal inspection: (i) incomplete sensor data within a given modality, such as partial point cloud loss or image degradation, and (ii) incomplete modalities, where one sensing channel (RGB or 3D) is entirely unavailable. By jointly addressing intra-modal incompleteness and cross-modal absence within a unified cross-distillation framework, our approach enhances anomaly detection robustness under both conditions. First, a teacher–student cross-modal distillation mechanism enables robust feature learning from both RGB and 3D modalities, allowing the student network to accurately detect anomalies even when a modality is missing during inference. Second, a dynamic voxel resolution adjustment with edge-retention strategy alleviates the computational burden of 3D point cloud processing while preserving crucial geometric features. By jointly enhancing robustness to missing modalities and improving computational efficiency, our method offers a resilient and practical solution for anomaly detection in real-world manufacturing scenarios. Extensive experiments demonstrate that the proposed method achieves both high robustness and efficiency across multiple industrial scenarios, establishing new state-of-the-art performance that surpasses existing approaches in both accuracy and speed. This method provides a robust solution for high-precision perception under complex detection conditions, significantly enhancing the feasibility of deploying anomaly detection systems in real industrial environments.
(This article belongs to the Section Industrial Sensors)
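The core cross-modal distillation idea, training a single-modality student to reproduce the fused teacher features so that inference survives a missing channel, can be sketched as follows; the encoders and feature sizes are hypothetical stand-ins, not the paper's networks.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical projection heads standing in for the real encoders.
teacher_fuse = nn.Linear(512 + 256, 128)  # teacher sees RGB + 3D features
student_rgb = nn.Linear(512, 128)         # student sees RGB only

def cross_distillation_loss(rgb_feat, pc_feat):
    """Match student (RGB-only) features to fused teacher features so the
    student can still score anomalies when the 3D modality is absent."""
    with torch.no_grad():  # teacher is frozen during distillation
        target = teacher_fuse(torch.cat([rgb_feat, pc_feat], dim=1))
    return F.mse_loss(student_rgb(rgb_feat), target)

loss = cross_distillation_loss(torch.randn(4, 512), torch.randn(4, 256))
```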

22 pages, 10534 KB  
Article
M3ASD: Integrating Multi-Atlas and Multi-Center Data via Multi-View Low-Rank Graph Structure Learning for Autism Spectrum Disorder Diagnosis
by Shuo Yang, Zuohao Yin, Yue Ma, Meiling Wang, Shuo Huang and Li Zhang
Brain Sci. 2025, 15(11), 1136; https://doi.org/10.3390/brainsci15111136 - 23 Oct 2025
Abstract
Background: Autism spectrum disorder (ASD) is a highly heterogeneous neurodevelopmental condition for which accurate and automated diagnosis is crucial to enable timely intervention. Resting-state functional magnetic resonance imaging (rs-fMRI) serves as one of the key modalities for diagnosing ASD and elucidating its underlying mechanisms. Numerous existing studies using rs-fMRI data have achieved accurate diagnostic performance. However, these methods often rely on a single brain atlas for constructing brain networks and overlook the data heterogeneity caused by variations in imaging devices, acquisition parameters, and processing pipelines across multiple centers. Methods: To address these limitations, this paper proposes a multi-view, low-rank subspace graph structure learning method to integrate multi-atlas and multi-center data for automated ASD diagnosis, termed M3ASD. The proposed framework first constructs functional connectivity matrices from multi-center neuroimaging data using multiple brain atlases. Edge weight filtering is then applied to build multiple brain networks with diverse topological properties, forming several complementary views. Samples from different classes are separately projected into low-rank subspaces within each view to mitigate data heterogeneity. Multi-view consistency regularization is further incorporated to extract more consistent and discriminative features from the low-rank subspaces across views. Results: Experimental results on the ABIDE-I dataset demonstrate that our model achieves an accuracy of 83.21%, outperforming most existing methods and confirming its effectiveness. Conclusions: Validated on the publicly available Autism Brain Imaging Data Exchange (ABIDE) dataset, the M3ASD method not only improves ASD diagnostic accuracy but also identifies common functional brain connections across atlases, thereby enhancing the interpretability of the method.
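The first stage described above, building functional connectivity matrices and filtering edge weights into complementary network views, can be illustrated with a short NumPy sketch; the Pearson correlation measure, quantile threshold, and atlas size are illustrative assumptions.

```python
import numpy as np

def connectivity_view(roi_timeseries, keep_ratio=0.2):
    """One thresholded brain-network 'view' from rs-fMRI ROI time series.

    roi_timeseries: (timepoints, n_rois) array for a single atlas.
    Keeps only the strongest edges, mimicking edge-weight filtering.
    """
    fc = np.corrcoef(roi_timeseries.T)   # ROI-by-ROI correlation matrix
    np.fill_diagonal(fc, 0.0)
    strength = np.abs(fc)
    cutoff = np.quantile(strength, 1.0 - keep_ratio)  # edge-weight filter
    return np.where(strength >= cutoff, fc, 0.0)

view = connectivity_view(np.random.randn(200, 116))  # e.g., a 116-ROI atlas
```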

24 pages, 5556 KB  
Article
Efficient Wearable Sensor-Based Activity Recognition for Human–Robot Collaboration in Agricultural Environments
by Sakorn Mekruksavanich and Anuchit Jitpattanakul
Informatics 2025, 12(4), 115; https://doi.org/10.3390/informatics12040115 - 23 Oct 2025
Abstract
This study focuses on human awareness, a critical component in human–robot interaction, particularly within agricultural environments where interactions are enriched by complex contextual information. The main objective is identifying human activities occurring during collaborative harvesting tasks involving humans and robots. To achieve this, we propose a novel and lightweight deep learning model, named 1D-ResNeXt, designed explicitly for recognizing activities in agriculture-related human–robot collaboration. The model is built as an end-to-end architecture incorporating feature fusion and a multi-kernel convolutional block strategy. It utilizes residual connections and a split–transform–merge mechanism to mitigate performance degradation and reduce model complexity by limiting the number of trainable parameters. Sensor data were collected from twenty individuals with five wearable devices placed on different body parts. Each sensor was embedded with tri-axial accelerometers, gyroscopes, and magnetometers. Under real field conditions, the participants performed several sub-tasks commonly associated with agricultural labor, such as lifting and carrying loads. Before classification, the raw sensor signals were pre-processed to eliminate noise. The cleaned time-series data were then input into the proposed deep learning network for sequential pattern recognition. Experimental results showed that the chest-mounted sensor achieved the highest F1-score of 99.86%, outperforming other sensor placements and combinations. An analysis of temporal window sizes (0.5, 1.0, 1.5, and 2.0 s) demonstrated that the 0.5 s window provided the best recognition performance, indicating that key activity features in agriculture can be captured over short intervals. Moreover, a comprehensive evaluation of sensor modalities revealed that multimodal fusion of accelerometer, gyroscope, and magnetometer data yielded the best accuracy at 99.92%. The combination of accelerometer and gyroscope data offered an optimal compromise, achieving 99.49% accuracy while maintaining lower system complexity. These findings highlight the importance of strategic sensor placement and data fusion in enhancing activity recognition performance while reducing the need for extensive data and computational resources. This work contributes to developing intelligent, efficient, and adaptive collaborative systems, offering promising applications in agriculture and beyond, with improved safety, cost-efficiency, and real-time operational capability.
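As a rough illustration of the split–transform–merge mechanism behind the proposed 1D-ResNeXt, the block below uses grouped 1D convolution to realize the parallel branches with few parameters; the channel count, cardinality, kernel size, and 100 Hz sampling rate are placeholder assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class ResNeXtBlock1D(nn.Module):
    """Residual split-transform-merge block for wearable sensor sequences."""

    def __init__(self, channels=64, cardinality=8, kernel_size=5):
        super().__init__()
        self.transform = nn.Sequential(
            # Grouped conv = "split" into branches and "transform" each one.
            nn.Conv1d(channels, channels, kernel_size,
                      padding=kernel_size // 2, groups=cardinality, bias=False),
            nn.BatchNorm1d(channels),
            nn.ReLU(inplace=True),
            nn.Conv1d(channels, channels, 1, bias=False),  # "merge" branches
            nn.BatchNorm1d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):                         # x: (batch, channels, time)
        return self.relu(x + self.transform(x))   # residual connection

out = ResNeXtBlock1D()(torch.randn(2, 64, 50))    # a 0.5 s window at 100 Hz
```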

21 pages, 584 KB  
Review
Beyond Imaging: Integrating Radiomics, Genomics, and Multi-Omics for Precision Breast Cancer Management
by Xiaorong Wu and Wei Dai
Cancers 2025, 17(21), 3408; https://doi.org/10.3390/cancers17213408 - 23 Oct 2025
Abstract
Radiomics has emerged as a promising tool for non-invasive tumour phenotyping in breast cancer, providing valuable insights into tumour heterogeneity, response prediction, and risk stratification. However, traditional radiomic approaches often rely on correlating imaging patterns with clinical data and lack direct biological interpretability. Combining radiomics with genomics or other multi-omics data can be important for personalising the diagnostic and therapeutic work-up in breast cancer management. This review aims to explore the current progress in integrating radiomics with multi-omics data (genomics and transcriptomics) to establish biologically grounded, multidimensional models for precision management of breast cancer. We will review recent advances in integrative radiomics and radiogenomics, highlight the synergy between imaging and molecular profiling, and discuss emerging machine learning methodologies that facilitate the integration of high-dimensional data. Applications of radiogenomics will be evaluated, including prediction of breast cancer subtypes and molecular mutations, radiogenomic mapping of the tumour immune microenvironment, forecasting of response to immunotherapy and targeted therapies, and lymph node involvement. Technical challenges, including harmonization across imaging modalities, interpretability, and advancing machine learning methodologies, will be addressed. This review positions integrative radiogenomics as a driving force for next-generation breast cancer care.
(This article belongs to the Special Issue Radiomics in Cancer)

23 pages, 6498 KB  
Article
A Cross-Modal Deep Feature Fusion Framework Based on Ensemble Learning for Land Use Classification
by Xiaohuan Wu, Houji Qi, Keli Wang, Yikun Liu and Yang Wang
ISPRS Int. J. Geo-Inf. 2025, 14(11), 411; https://doi.org/10.3390/ijgi14110411 - 23 Oct 2025
Abstract
Land use classification based on multi-modal data fusion has gained significant attention due to its potential to capture the complex characteristics of urban environments. However, effectively extracting and integrating discriminative features derived from heterogeneous geospatial data remain challenging. This study proposes an ensemble learning framework for land use classification by fusing cross-modal deep features from both physical and socioeconomic perspectives. Specifically, the framework utilizes the Masked Autoencoder (MAE) to extract global spatial dependencies from remote sensing imagery and applies long short-term memory (LSTM) networks to model spatial distribution patterns of points of interest (POIs) based on type co-occurrence. Furthermore, we employ inter-modal contrastive learning to enhance the representation of physical and socioeconomic features. To verify the superiority of the ensemble learning framework, we apply it to map the land use distribution of Beijing. By coupling various physical and socioeconomic features, the framework achieves an average accuracy of 84.33%, surpassing several comparative baseline methods. Furthermore, the framework demonstrates comparable performance when applied to a Shenzhen dataset, confirming its robustness and generalizability. The findings highlight the importance of fully extracting and effectively integrating multi-source deep features in land use classification, providing a robust solution for urban planning and sustainable development.
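The closing ensemble step, merging the MAE image branch and the LSTM POI branch into one land-use decision, can be sketched as weighted soft voting; this is a simple fusion rule chosen for illustration, and the paper's actual ensemble strategy may differ.

```python
import numpy as np

def soft_vote(prob_list, weights=None):
    """Fuse per-branch class probabilities by weighted averaging.

    prob_list: list of (n_parcels, n_classes) arrays, e.g. one from the
    imagery branch and one from the POI branch.
    """
    probs = np.stack(prob_list)                    # (n_branches, n, c)
    if weights is None:
        weights = np.full(len(prob_list), 1.0 / len(prob_list))
    fused = np.tensordot(weights, probs, axes=1)   # weighted average
    return fused.argmax(axis=1)                    # predicted land-use class

labels = soft_vote([np.random.dirichlet(np.ones(5), size=10),
                    np.random.dirichlet(np.ones(5), size=10)])
```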

28 pages, 2038 KB  
Article
Cognitive-Inspired Multimodal Learning Framework for Hazard Identification in Highway Construction with BIM–GIS Integration
by Jibiao Zhou, Zewei Li, Zhan Shi, Xinhua Mao and Chao Gao
Sustainability 2025, 17(21), 9395; https://doi.org/10.3390/su17219395 - 22 Oct 2025
Abstract
Highway construction remains one of the most hazardous sectors in the infrastructure domain, where persistent accident rates challenge the vision of sustainable and safe development. Traditional hazard identification methods rely on manual inspections that are often slow, error-prone, and unable to cope with complex and dynamic site conditions. To address these limitations, this study develops a cognitive-inspired multimodal learning framework integrated with BIM–GIS-enabled digital twins to advance intelligent hazard identification and digital management for highway construction safety. The framework introduces three key innovations: a biologically grounded attention mechanism that simulates inspector search behavior, an adaptive multimodal fusion strategy that integrates visual, textual, and sensor information, and a closed-loop digital twin platform that synchronizes physical and virtual environments in real time. The system was validated across five highway construction projects over an 18-month period. Results show that the framework achieved a hazard detection accuracy of 91.7% with an average response time of 147 ms. Compared with conventional computer vision methods, accuracy improved by 18.2%, while gains over commercial safety systems reached 24.8%. Field deployment demonstrated a 34% reduction in accidents and a 42% increase in inspection efficiency, delivering a positive return on investment within 8.7 months. By linking predictive safety analytics with BIM–GIS semantics and site telemetry, the framework enhances construction safety, reduces delays and rework, and supports more resource-efficient, low-disruption project delivery, highlighting its potential as a sustainable pathway toward zero-accident highway construction.

72 pages, 2054 KB  
Article
Neural Network IDS/IPS Intrusion Detection and Prevention System with Adaptive Online Training to Improve Corporate Network Cybersecurity, Evidence Recording, and Interaction with Law Enforcement Agencies
by Serhii Vladov, Victoria Vysotska, Svitlana Vashchenko, Serhii Bolvinov, Serhii Glubochenko, Andrii Repchonok, Maksym Korniienko, Mariia Nazarkevych and Ruslan Herasymchuk
Big Data Cogn. Comput. 2025, 9(11), 267; https://doi.org/10.3390/bdcc9110267 - 22 Oct 2025
Abstract
This article examines the problem of reliable online detection and prevention of intrusions (IDS/IPS) in dynamic corporate networks, where traditional signature-based methods fail to keep pace with new and evolving attacks, and streaming data is susceptible to drift and targeted "poisoning" of the training dataset. As a solution, we propose a hybrid neural network system with adaptive online training, a formal minimax framework for false-positive control, and a set of robustness mechanisms (a Huber model, pruned learning rate, DRO, a gradient-norm regularizer, and prioritized replay). In practice, the system combines modal encoders for traffic, logs, and metrics; a temporal GNN for entity correlation; a variational module for uncertainty assessment; a differentiable symbolic unit for logical rules; an RL agent for incident prioritization; and an NLG module for explanations and the preparation of forensically relevant artifacts. These components are connected via a cognitive layer (cross-modal fusion memory), a Bayesian neural network fuser, and a single multi-task loss function. The practical implementation includes the pipeline "novelty detection → active labelling → incremental supervised update" and chain-of-custody mechanisms for evidential fitness. A significant improvement in quality is demonstrated experimentally: the developed system achieves an ROC AUC of 0.96, an F1-score of 0.95, and a significantly lower FPR than baseline architectures (MLP, CNN, and LSTM). In applied validation tasks, detection rates of ≈92–94% and resistance to distribution drift are observed.
(This article belongs to the Special Issue Internet Intelligence for Cybersecurity)
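Among the robustness mechanisms listed is a Huber model; as a minimal illustration of why that choice blunts poisoned training points, the sketch below defines the Huber loss, whose gradient is bounded in the tails (the system's full multi-task objective is not reproduced here).

```python
import numpy as np

def huber_loss(residual, delta=1.0):
    """Quadratic near zero, linear in the tails, so a single extreme
    (possibly poisoned) sample contributes only a bounded gradient."""
    r = np.abs(residual)
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))

# An outlier residual of 5.0 costs 4.5 instead of the squared-loss 12.5.
print(huber_loss(np.array([0.5, 5.0])))  # [0.125 4.5]
```
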
22 pages, 4655 KB  
Article
Rural Settlement Mapping and Its Spatiotemporal Dynamics Monitoring in the Yellow River Delta Using Multi-Modal Fusion of Landsat Optical and Sentinel-1 SAR Polarimetric Decomposition Data by Leveraging Deep Learning
by Jiantao Liu, Yan Zhang, Fei Meng, Jianhua Gong, Dong Zhang, Yu Peng and Can Zhang
Remote Sens. 2025, 17(21), 3512; https://doi.org/10.3390/rs17213512 - 22 Oct 2025
Abstract
The Yellow River Delta (YRD) is a vital agricultural and ecologically fragile zone in China. Understanding the spatial pattern and evolutionary characteristics of Rural Settlements Area (RSA) in this region is crucial for both ecological protection and sustainable development. This study focuses on Dongying, a key YRD city, and compares four advanced deep learning models—U-Net, DeepLabv3+, TransUNet, and TransDeepLab—using fused Sentinel-1 radar and Landsat optical imagery to identify the optimal method for RSA mapping. Results show that TransUNet, integrating polarization and optical features, achieves the highest accuracy, with Precision, Recall, F1 score, and mIoU of 89.27%, 80.70%, 84.77%, and 85.39%, respectively. Accordingly, TransUNet was applied for the spatiotemporal extraction of RSA in 2002, 2008, 2015, 2019, and 2023. The results indicate that medium-sized settlements dominate, showing a "dense in the west/south, sparse in the east/north" pattern with clustered distribution. Settlement patches are generally regular but grow more complex over time while maintaining strong connectivity. In summary, the proposed method offers technical support for RSA identification in the YRD, and the extracted multi-temporal settlement data can serve as a valuable reference for optimizing settlement layout in the region.
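The reported Precision, Recall, F1 score, and mIoU all derive from pixel-level confusion counts; a minimal sketch for a binary settlement mask follows (mIoU additionally averages the IoU over classes).

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Precision, recall, F1, and IoU for binary settlement masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)
    return precision, recall, f1, iou

m = segmentation_metrics(np.random.rand(64, 64) > 0.5,
                         np.random.rand(64, 64) > 0.5)
```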

39 pages, 1188 KB  
Review
A Scoping Review of AI-Based Approaches for Detecting Autism Traits Using Voice and Behavioral Data
by Hajarimino Rakotomanana and Ghazal Rouhafzay
Bioengineering 2025, 12(11), 1136; https://doi.org/10.3390/bioengineering12111136 - 22 Oct 2025
Abstract
This scoping review systematically maps the rapidly evolving application of Artificial Intelligence (AI) in Autism Spectrum Disorder (ASD) diagnostics, specifically focusing on computational behavioral phenotyping. Recognizing that observable traits like speech and movement are critical for early, timely intervention, the study synthesizes AI's use across eight key behavioral modalities. These include voice biomarkers, conversational dynamics, linguistic analysis, movement analysis, activity recognition, facial gestures, visual attention, and multimodal approaches. The review analyzed 158 studies published between 2015 and 2025, revealing that modern Machine Learning and Deep Learning techniques demonstrate highly promising diagnostic performance in controlled environments, with reported accuracies of up to 99%. Despite this significant capability, the review identifies critical challenges that impede clinical implementation and generalizability. These persistent limitations include pervasive issues with dataset heterogeneity, gender bias in samples, and small overall sample sizes. By detailing the current landscape of observable data types, computational methodologies, and available datasets, this work establishes a comprehensive overview of AI's current strengths and fundamental weaknesses in ASD diagnosis. The article concludes by providing actionable recommendations aimed at guiding future research toward developing diagnostic solutions that are more inclusive, generalizable, and ultimately applicable in clinical settings.

25 pages, 1741 KB  
Article
Event-Aware Multimodal Time-Series Forecasting via Symmetry-Preserving Graph-Based Cross-Regional Transfer Learning
by Shu Cao and Can Zhou
Symmetry 2025, 17(11), 1788; https://doi.org/10.3390/sym17111788 - 22 Oct 2025
Abstract
Forecasting real-world time series in domains with strong event sensitivity and regional variability poses unique challenges, as predictive models must account for sudden disruptions, heterogeneous contextual factors, and structural differences across locations. In tackling these challenges, we draw on the concept of symmetry, which refers to balance and invariance patterns across temporal, multimodal, and structural dimensions that help reveal consistent relationships and recurring patterns within complex systems. This study is based on two multimodal datasets covering 12 tourist regions and more than 3 years of records, ensuring the robustness and practical relevance of the results. In many applications, such as monitoring economic indicators, assessing operational performance, or predicting demand patterns, short-term fluctuations are often triggered by discrete events, policy changes, or external incidents, which conventional statistical and deep learning approaches struggle to model effectively. To address these limitations, we propose an event-aware multimodal time-series forecasting framework with graph-based regional transfer built upon an enhanced PatchTST backbone. The framework unifies multimodal feature extraction, event-sensitive temporal reasoning, and graph-based structural adaptation. Unlike Informer, Autoformer, FEDformer, or PatchTST, our model explicitly addresses naive multimodal fusion, event-agnostic modeling, and weak cross-regional transfer by introducing an event-aware Multimodal Encoder, a Temporal Event Reasoner, and a Multiscale Graph Module. Experiments on diverse multi-region multimodal datasets demonstrate that our method achieves substantial improvements over eight state-of-the-art baselines in forecasting accuracy, event response modeling, and transfer efficiency. Specifically, our model achieves a 15.06% improvement in the event recovery index, a 15.1% reduction in MAE, and a 19.7% decrease in event response error compared to PatchTST, highlighting its empirical impact on tourism event economics forecasting.
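The PatchTST backbone that the framework extends tokenizes each series into overlapping patches before attention is applied; the patching step can be sketched in a few lines (patch length and stride here are illustrative).

```python
import torch

def patchify(series, patch_len=16, stride=8):
    """Split a (batch, time) series into overlapping patches.

    Returns (batch, n_patches, patch_len); each patch becomes one token
    for the transformer backbone.
    """
    return series.unfold(1, patch_len, stride)

patches = patchify(torch.randn(4, 96))
print(patches.shape)  # torch.Size([4, 11, 16])
```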

21 pages, 1732 KB  
Review
Artificial Intelligence in Clinical Oncology: From Productivity Enhancement to Creative Discovery
by Masahiro Kuno, Hiroki Osumi, Shohei Udagawa, Kaoru Yoshikawa, Akira Ooki, Eiji Shinozaki, Tetsuo Ishikawa, Junna Oba, Kensei Yamaguchi and Kazuhiro Sakurada
Curr. Oncol. 2025, 32(11), 588; https://doi.org/10.3390/curroncol32110588 - 22 Oct 2025
Abstract
Modern clinical oncology faces an unprecedented data complexity that exceeds human analytical capacity, making artificial intelligence (AI) integration essential rather than optional. This review examines the dual impact of AI on productivity enhancement and creative discovery in cancer care. We trace the evolution from traditional machine learning to deep learning and transformer-based foundation models, analyzing their clinical applications. AI enhances productivity by automating diagnostic tasks, streamlining documentation, and accelerating research workflows across imaging modalities and clinical data processing. More importantly, AI enables creative discovery by integrating multimodal data to identify computational biomarkers, performing unsupervised phenotyping to reveal hidden patient subgroups, and accelerating drug development. Finally, we introduce the FUTURE-AI framework, outlining the essential requirements for translating AI models into clinical practice. This ensures the responsible deployment of AI, which augments rather than replaces clinical judgment, while maintaining patient-centered care.

12 pages, 3307 KB  
Article
Redefining MRI-Based Skull Segmentation Through AI-Driven Multimodal Integration
by Michel Beyer, Alexander Aigner, Alexandru Burde, Alexander Brasse, Sead Abazi, Lukas B. Seifert, Jakob Wasserthal, Martin Segeroth, Mohamed Omar and Florian M. Thieringer
J. Imaging 2025, 11(11), 372; https://doi.org/10.3390/jimaging11110372 - 22 Oct 2025
Abstract
Skull segmentation in magnetic resonance imaging (MRI) is essential for cranio-maxillofacial (CMF) surgery planning, yet manual approaches are time-consuming and error-prone. Computed tomography (CT) provides superior bone contrast but exposes patients to ionizing radiation, which is particularly concerning in pediatric care. This study presents an AI-based workflow that enables skull segmentation directly from routine MRI. Using 186 paired CT–MRI datasets, CT-based segmentations were transferred to MRI via multimodal registration to train dedicated deep learning models. Performance was evaluated against manually segmented CT ground truth using the Dice Similarity Coefficient (DSC), Mean Surface Distance (MSD), and Hausdorff Distance (HD). The AI models achieved higher performance on CT (DSC 0.981) than on MRI (DSC 0.864), with MSD and HD also favoring CT. Despite lower absolute accuracy on MRI, the approach substantially improved segmentation quality compared with manual MRI methods, particularly in clinically relevant regions. This automated method enables accurate skull modeling from standard MRI without radiation exposure or specialized sequences. While CT remains more precise, the presented framework enhances MRI utility in surgical planning, reduces manual workload, and supports safer, patient-specific treatment, especially for pediatric and trauma cases.
(This article belongs to the Section AI in Imaging)
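The headline DSC numbers are the standard overlap measure 2|A∩B| / (|A| + |B|) between predicted and ground-truth masks; a minimal sketch for binary skull volumes:

```python
import numpy as np

def dice_coefficient(pred, truth, eps=1e-8):
    """Dice Similarity Coefficient between two binary masks (1.0 = perfect)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum() + eps)

dsc = dice_coefficient(np.random.rand(32, 32, 32) > 0.5,
                       np.random.rand(32, 32, 32) > 0.5)
```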
