Search Results (3,098)

Search Parameters:
Keywords = deep transfer learning

29 pages, 7183 KB  
Article
Exploring Urban Spatial Quality Through Street View Imagery and Human Perception Analysis
by Yonghao Li, Jialin Lu, Yuan Meng, Yiwen Luo and Juan Ren
Buildings 2025, 15(17), 3116; https://doi.org/10.3390/buildings15173116 (registering DOI) - 31 Aug 2025
Abstract
Amid the global challenges of rapid urbanization, understanding how micro-scale spatial features shape human perception is critical for advancing livable cities. This study proposes a data-driven framework that integrates street view imagery, deep learning-based semantic segmentation, and machine learning interpretation models including SHAP analysis to explore the relationship between urban spatial characteristics and subjective perceptions. A total of 12,604 street-level images from Xi’an, China, were analyzed to extract seven spatial indicators. These indicators were then linked with perceptual data across six emotional dimensions derived from the Place Pulse 2.0 dataset. The analysis revealed that natural elements significantly enhance perceived comfort and aesthetics, while high-density built environments can suppress perceived safety and liveliness. Spatial clustering further identified three urban typologies—traditional, transitional, and modern—with distinct perceptual signatures. These findings offer scalable and transferable insights for perception-informed urban design and renewal, particularly in dense urban settings worldwide. Full article
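The attribution step in frameworks like this one can be illustrated with a minimal permutation-importance sketch, a simpler cousin of the SHAP analysis the abstract names. Everything below is hypothetical (the toy model, the data, and the feature names); it only shows the mechanic: shuffling an informative indicator (here, a green-view ratio) can lower accuracy, while shuffling an irrelevant one cannot.

```python
import random

def permutation_importance(model, X, y, feature_idx, metric, n_repeats=10, seed=0):
    """Mean drop in the metric when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = metric(y, [model(row) for row in X])
    drops = []
    for _ in range(n_repeats):
        col = [row[feature_idx] for row in X]
        rng.shuffle(col)
        X_perm = [row[:feature_idx] + [v] + row[feature_idx + 1:]
                  for row, v in zip(X, col)]
        drops.append(baseline - metric(y, [model(row) for row in X_perm]))
    return sum(drops) / n_repeats

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy "perception" model: predicts comfortable (1) when green-view ratio > 0.3.
model = lambda row: 1 if row[0] > 0.3 else 0
X = [[0.1, 5.0], [0.5, 2.0], [0.4, 8.0], [0.2, 1.0]]  # [green ratio, density]
y = [0, 1, 1, 0]
```

Because the toy model ignores the second column entirely, its permutation importance is exactly zero, while the green-view column's importance is non-negative.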

54 pages, 1801 KB  
Review
Research Progress in Multi-Domain and Cross-Domain AI Management and Control for Intelligent Electric Vehicles
by Dagang Lu, Yu Chen, Yan Sun, Wenxuan Wei, Shilin Ji, Hongshuo Ruan, Fengyan Yi, Chunchun Jia, Donghai Hu, Kunpeng Tang, Song Huang and Jing Wang
Energies 2025, 18(17), 4597; https://doi.org/10.3390/en18174597 - 29 Aug 2025
Viewed by 107
Abstract
Recent breakthroughs in artificial intelligence are accelerating the intelligent transformation of vehicles. Vehicle electronic and electrical architectures are converging toward centralized domain controllers. Deep learning, reinforcement learning, and deep reinforcement learning now form the core technologies of domain control. This review surveys advances in deep reinforcement learning in four vehicle domains: intelligent driving, powertrain, chassis, and cockpit. It identifies the main tasks and active research fronts in each domain. In intelligent driving, deep reinforcement learning handles object detection, object tracking, vehicle localization, trajectory prediction, and decision making. In the powertrain domain, it improves power regulation, energy management, and thermal management. In the chassis domain, it enables precise steering, braking, and suspension control. In the cockpit domain, it supports occupant monitoring, comfort regulation, and human–machine interaction. The review then synthesizes research on cross-domain fusion. It identifies transfer learning as a crucial method to address scarce training data and poor generalization. These limits still hinder large-scale deployment of deep reinforcement learning in intelligent electric vehicle domain control. The review closes with future directions: rigorous safety assurance, real-time implementation, and scalable on-board learning. It offers a roadmap for the continued evolution of deep-reinforcement-learning-based vehicle domain control technology. Full article
36 pages, 10083 KB  
Article
Hierarchical Deep Feature Fusion and Ensemble Learning for Enhanced Brain Tumor MRI Classification
by Zahid Ullah and Jihie Kim
Mathematics 2025, 13(17), 2787; https://doi.org/10.3390/math13172787 - 29 Aug 2025
Viewed by 182
Abstract
Accurate classification of brain tumors in medical imaging is crucial to ensure reliable diagnoses and effective treatment planning. This study introduces a novel double ensemble framework that synergistically combines pre-trained Deep Learning (DL) models for feature extraction with optimized Machine Learning (ML) classifiers for robust classification. The framework incorporates comprehensive preprocessing and data augmentation of brain Magnetic Resonance Images (MRIs), followed by deep feature extraction based on transfer learning using pre-trained Vision Transformer (ViT) networks. The novelty of our approach lies in its dual-level ensemble strategy: we employ a feature-level ensemble, which integrates deep features from the top-performing ViT models, and a classifier-level ensemble, which aggregates predictions from various hyperparameter-optimized ML classifiers. Experiments on two public MRI brain tumor datasets from Kaggle demonstrate that this approach significantly surpasses state-of-the-art methods, underscoring the importance of feature and classifier fusion. The proposed methodology also highlights the critical roles that hyperparameter optimization and advanced preprocessing techniques can play in improving the diagnostic accuracy and reliability of medical image analysis, advancing the integration of DL and ML in this vital, clinically relevant task. Full article
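The dual-level ensemble idea (fuse features from several backbones, then aggregate classifier votes) can be sketched in a few lines. The backbone outputs, the 2-D feature sizes, and the class names below are invented for illustration; the paper's pipeline uses ViT features and hyperparameter-optimized ML classifiers.

```python
def fuse_features(feature_sets):
    """Feature-level ensemble: concatenate per-model feature vectors sample-wise."""
    return [sum(fs, []) for fs in zip(*feature_sets)]

def majority_vote(predictions):
    """Classifier-level ensemble: per-sample majority over classifier outputs."""
    return [max(set(votes), key=votes.count) for votes in zip(*predictions)]

# Two hypothetical backbones emit 2-D features for three MRI scans.
vit_a = [[0.1, 0.9], [0.8, 0.2], [0.4, 0.6]]
vit_b = [[0.3, 0.7], [0.9, 0.1], [0.5, 0.5]]
fused = fuse_features([vit_a, vit_b])  # 4-D fused vectors, one per scan

# Three hypothetical classifiers vote on tumor class labels.
clf_preds = [["glioma", "meningioma", "glioma"],
             ["glioma", "meningioma", "pituitary"],
             ["glioma", "pituitary", "glioma"]]
final = majority_vote(clf_preds)
```

The fused vectors would feed the downstream classifiers; the vote combines those classifiers' hard predictions.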

19 pages, 13244 KB  
Article
MWR-Net: An Edge-Oriented Lightweight Framework for Image Restoration in Single-Lens Infrared Computational Imaging
by Xuanyu Qian, Xuquan Wang, Yujie Xing, Guishuo Yang, Xiong Dun, Zhanshan Wang and Xinbin Cheng
Remote Sens. 2025, 17(17), 3005; https://doi.org/10.3390/rs17173005 - 29 Aug 2025
Viewed by 185
Abstract
Infrared video imaging is a cornerstone technology for environmental perception, particularly in drone-based remote sensing applications such as disaster assessment and infrastructure inspection. Conventional systems, however, rely on bulky optical architectures that limit deployment on lightweight aerial platforms. Computational imaging offers a promising alternative by integrating optical encoding with algorithmic reconstruction, enabling compact hardware while maintaining imaging performance comparable to sophisticated multi-lens systems. Nonetheless, achieving real-time video-rate computational image restoration on resource-constrained unmanned aerial vehicles (UAVs) remains a critical challenge. To address this, we propose Mobile Wavelet Restoration-Net (MWR-Net), a lightweight deep learning framework tailored for real-time infrared image restoration. Built on a MobileNetV4 backbone, MWR-Net leverages depthwise separable convolutions and an optimized downsampling scheme to minimize parameters and computational overhead. A novel wavelet-domain loss enhances high-frequency detail recovery, while the modulation transfer function (MTF) is adopted as an optics-aware evaluation metric. With only 666.37 K parameters and 6.17 G MACs, MWR-Net achieves a PSNR of 37.10 dB and an SSIM of 0.964 on a custom dataset, outperforming a pruned U-Net baseline. Deployed on an RK3588 chip, it runs at 42 FPS. These results demonstrate MWR-Net’s potential as an efficient and practical solution for UAV-based infrared sensing applications. Full article
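The PSNR figure reported above (37.10 dB) comes from a standard formula that is easy to reproduce. A minimal version for images in the [0, 1] range, with made-up 2x2 "images" purely for illustration:

```python
import math

def psnr(reference, restored, peak=1.0):
    """PSNR = 10 * log10(peak^2 / MSE) between two equally sized images."""
    flat_r = [p for row in reference for p in row]
    flat_x = [p for row in restored for p in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_r, flat_x)) / len(flat_r)
    if mse == 0:
        return float("inf")  # identical images: no distortion
    return 10.0 * math.log10(peak ** 2 / mse)

ref = [[0.0, 0.5], [1.0, 0.25]]
out = [[0.1, 0.5], [0.9, 0.25]]  # two pixels off by 0.1 -> MSE = 0.005
```

With an MSE of 0.005 this toy pair scores 10 * log10(200), roughly 23 dB; higher PSNR means restoration closer to the reference.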

19 pages, 5315 KB  
Article
Style-Aware and Uncertainty-Guided Approach to Semi-Supervised Domain Generalization in Medical Imaging
by Zineb Tissir, Yunyoung Chang and Sang-Woong Lee
Mathematics 2025, 13(17), 2763; https://doi.org/10.3390/math13172763 - 28 Aug 2025
Viewed by 223
Abstract
Deep learning has significantly advanced medical image analysis by enabling accurate, automated diagnosis across diverse clinical tasks such as lesion classification and disease detection. However, the practical deployment of these systems is still hindered by two major challenges: the limited availability of expert-annotated data and substantial domain shifts caused by variations in imaging devices, acquisition protocols, and patient populations. Although recent semi-supervised domain generalization (SSDG) approaches attempt to address these challenges, they often suffer from two key limitations: (i) reliance on computationally expensive uncertainty modeling techniques such as Monte Carlo dropout, and (ii) inflexible shared-head classifiers that fail to capture domain-specific variability across heterogeneous imaging styles. To overcome these limitations, we propose MultiStyle-SSDG, a unified semi-supervised domain generalization framework designed to improve model generalization in low-label scenarios. Our method introduces a multi-style ensemble pseudo-labeling strategy guided by entropy-based filtering, incorporates prototype-based conformity and semantic alignment to regularize the feature space, and employs a domain-specific multi-head classifier fused through attention-weighted prediction. Additionally, we introduce a dual-level neural-style transfer pipeline that simulates realistic domain shifts while preserving diagnostic semantics. We validated our framework on the ISIC2019 skin lesion classification benchmark using 5% and 10% labeled data. MultiStyle-SSDG consistently outperformed recent state-of-the-art methods such as FixMatch, StyleMatch, and UPLM, achieving statistically significant improvements in classification accuracy under simulated domain shifts including style, background, and corruption. Specifically, our method achieved 78.6% accuracy with 5% labeled data and 80.3% with 10% labeled data on ISIC2019, surpassing FixMatch by 4.9–5.3 percentage points and UPLM by 2.1–2.4 points. Ablation studies further confirmed the individual contributions of each component, and t-SNE visualizations illustrate enhanced intra-class compactness and cross-domain feature consistency. These results demonstrate that our style-aware, modular framework offers a robust and scalable solution for generalizable computer-aided diagnosis in real-world medical imaging settings. Full article
(This article belongs to the Section E1: Mathematics and Computer Science)
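The entropy-based pseudo-label filter described above can be sketched independently of any backbone: compute the Shannon entropy of each softmax output and keep a hard label only when the entropy is low (the model is confident). The probability vectors and the 0.5-nat threshold below are illustrative values, not the paper's.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one softmax output."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def filter_pseudo_labels(soft_preds, threshold):
    """Keep a hard pseudo-label only for confident (low-entropy)
    predictions; return None for samples that should be discarded."""
    kept = []
    for probs in soft_preds:
        if entropy(probs) <= threshold:
            kept.append(probs.index(max(probs)))
        else:
            kept.append(None)
    return kept

# Hypothetical softmax outputs over three lesion classes.
preds = [[0.97, 0.02, 0.01],   # confident: kept as class 0
         [0.40, 0.35, 0.25]]   # near-uniform: discarded
labels = filter_pseudo_labels(preds, threshold=0.5)
```

In a full SSDG loop, only the kept labels would contribute to the unlabeled-data loss.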

29 pages, 11689 KB  
Article
Enhanced Breast Cancer Diagnosis Using Multimodal Feature Fusion with Radiomics and Transfer Learning
by Nazmul Ahasan Maruf, Abdullah Basuhail and Muhammad Umair Ramzan
Diagnostics 2025, 15(17), 2170; https://doi.org/10.3390/diagnostics15172170 - 28 Aug 2025
Viewed by 321
Abstract
Background: Breast cancer remains a critical public health problem worldwide and is a leading cause of cancer-related mortality. Optimizing clinical outcomes is contingent upon the early and precise detection of malignancies. Advances in medical imaging and artificial intelligence (AI), particularly in the fields of radiomics and deep learning (DL), have contributed to improvements in early detection methodologies. Nonetheless, persistent challenges, including limited data availability, model overfitting, and restricted generalization, continue to hinder performance. Methods: This study aims to overcome existing challenges by improving model accuracy and robustness through enhanced data augmentation and the integration of radiomics and deep learning features from the CBIS-DDSM dataset. To mitigate overfitting and improve model generalization, data augmentation techniques were applied. The PyRadiomics library was used to extract radiomics features, while transfer learning models were employed to derive deep learning features from the augmented training dataset. For radiomics feature selection, we compared multiple supervised feature selection methods, including RFE with random forest and logistic regression, ANOVA F-test, LASSO, and mutual information. Embedded methods with XGBoost, LightGBM, and CatBoost for GPUs were also explored. Finally, we integrated radiomics and deep features to build a unified multimodal feature space for improved classification performance. Based on this integrated set of radiomics and deep learning features, 13 pre-trained transfer learning models were trained and evaluated, including various versions of ResNet (50, 50V2, 101, 101V2, 152, 152V2), DenseNet (121, 169, 201), InceptionV3, MobileNet, and VGG (16, 19). Results: Among the evaluated models, ResNet152 achieved the highest classification accuracy of 97%, demonstrating the potential of this approach to enhance diagnostic precision. Other models, including VGG19, ResNet101V2, and ResNet101, achieved 96% accuracy, emphasizing the importance of the selected feature set in achieving robust detection. Conclusions: Future research could build on this work by incorporating Vision Transformer (ViT) architectures and leveraging multimodal data (e.g., clinical data, genomic information, and patient history). This could improve predictive performance and make the model more robust and adaptable to diverse data types. Ultimately, this approach has the potential to transform breast cancer detection, making it more accurate and interpretable. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
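At its simplest, building a unified multimodal feature space means standardizing each modality column-wise and concatenating per sample. A minimal sketch with tiny hypothetical feature matrices (real PyRadiomics vectors have hundreds of columns, and real deep features thousands):

```python
def zscore(column):
    """Standardize one feature column to zero mean, unit variance."""
    mean = sum(column) / len(column)
    var = sum((v - mean) ** 2 for v in column) / len(column)
    std = var ** 0.5 or 1.0  # guard against constant columns
    return [(v - mean) / std for v in column]

def fuse_multimodal(radiomic, deep):
    """Column-wise standardize each modality, then concatenate per sample."""
    def normalize(rows):
        cols = [zscore(list(col)) for col in zip(*rows)]
        return [list(sample) for sample in zip(*cols)]
    return [r + d for r, d in zip(normalize(radiomic), normalize(deep))]

# Two samples, two hypothetical radiomics columns, one deep-feature column.
radiomic = [[10.0, 1.0], [20.0, 3.0]]
deep = [[0.2], [0.4]]
fused = fuse_multimodal(radiomic, deep)
```

Standardizing first keeps one modality's larger numeric range from dominating the fused space before feature selection and classification.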

23 pages, 8920 KB  
Article
All-Weather Forest Fire Automatic Monitoring and Early Warning Application Based on Multi-Source Remote Sensing Data: Case Study of Yunnan
by Boyang Gao, Weiwei Jia, Qiang Wang and Guang Yang
Fire 2025, 8(9), 344; https://doi.org/10.3390/fire8090344 - 27 Aug 2025
Viewed by 267
Abstract
Forest fires pose severe ecological, climatic, and socio-economic threats, destroying habitats and emitting greenhouse gases. Early and timely warning is particularly challenging because fires often originate from small-scale, low-temperature ignition sources. Traditional monitoring approaches primarily rely on single-source satellite imagery and empirical threshold algorithms, and most forest fire monitoring tasks remain human-driven. Existing frameworks have yet to effectively integrate multiple data sources and detection algorithms, lacking the capability to provide continuous, automated, and generalizable fire monitoring across diverse fire scenarios. To address these challenges, this study first improves multiple monitoring algorithms for forest fire detection, including a statistically enhanced automatic thresholding method; data augmentation to expand the U-Net deep learning dataset; and the application of a freeze–unfreeze transfer learning strategy to the U-Net transfer model. Multiple algorithms are systematically evaluated across varying fire scales, showing that the improved automatic threshold method achieves the best performance on GF-4 imagery with an F-score of 0.915 (95% CI: 0.8725–0.9524), while the U-Net deep learning algorithm yields the highest F-score of 0.921 (95% CI: 0.8537–0.9739) on Landsat 8 imagery. All methods demonstrate robust performance and generalizability across diverse scenarios. Second, data-driven scheduling technology is developed to automatically initiate preprocessing and fire detection tasks, significantly reducing fire discovery time. Finally, an integrated framework of multi-source remote sensing data, advanced detection algorithms, and a user-friendly visualization interface is proposed. This framework enables all-weather, fully automated forest fire monitoring and early warning, facilitating dynamic tracking of fire evolution and precise fire line localization through the cross-application of heterogeneous data sources. The framework’s effectiveness and practicality are validated through wildfire cases in two regions of Yunnan Province, offering scalable technical support for improving early detection of and rapid response to forest fires. Full article
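The idea behind statistical automatic thresholding can be illustrated with the classic scene-statistics rule: flag pixels whose brightness temperature exceeds the scene mean by k standard deviations. The 100-pixel toy scene and k = 2 below are hypothetical; the paper's method is considerably more elaborate.

```python
def auto_threshold(pixels, k=2.0):
    """Flag fire-candidate pixels hotter than scene mean + k * std."""
    mean = sum(pixels) / len(pixels)
    std = (sum((p - mean) ** 2 for p in pixels) / len(pixels)) ** 0.5
    cutoff = mean + k * std
    return [p > cutoff for p in pixels], cutoff

# A 100-pixel toy scene: ~300 K background with one 400 K hotspot.
scene = [300.0] * 99 + [400.0]
flags, cutoff = auto_threshold(scene)
```

Because the cutoff adapts to scene statistics, the same rule works across scenes with different background temperatures, which is the appeal over a fixed empirical threshold.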

19 pages, 5636 KB  
Article
Complete Workflow for ER-IHC Pathology Database Revalidation
by Md Hadayet Ullah, Md Jahid Hasan, Wan Siti Halimatul Munirah Wan Ahmad, Mohammad Faizal Ahmad Fauzi, Zaka Ur Rehman, Jenny Tung Hiong Lee, See Yee Khor and Lai-Meng Looi
AI 2025, 6(9), 204; https://doi.org/10.3390/ai6090204 - 27 Aug 2025
Viewed by 818
Abstract
Computer-aided systems can assist doctors in detecting cancer at an early stage using medical image analysis. In estrogen receptor immunohistochemistry (ER-IHC)-stained whole-slide images, automated cell identification and segmentation are helpful in the prediction scoring of hormone receptor status, which aids pathologists in determining whether to recommend hormonal therapy or other therapies for a patient. Accurate scoring can be achieved with accurate segmentation and classification of the nuclei. This paper presents two main objectives: first is to identify the top three models for this classification task and establish an ensemble model, all using 10-fold cross-validation strategy; second is to detect recurring misclassifications within the dataset to identify “misclassified nuclei” or “incorrectly labeled nuclei” for the nuclei class ground truth. The classification task is carried out using 32 pre-trained deep learning models from Keras Applications, focusing on their effectiveness in classifying negative, weak, moderate, and strong nuclei in the ER-IHC histopathology images. An ensemble learning with logistic regression approach is employed for the three best models. The analysis reveals that the top three performing models are EfficientNetB0, EfficientNetV2B2, and EfficientNetB4 with an accuracy of 94.37%, 94.36%, and 94.29%, respectively, and the ensemble model’s accuracy is 95%. We also developed a web-based platform for the pathologists to rectify the “faulty-class” nuclei in the dataset. The complete flow of this work can benefit the field of medical image analysis especially when dealing with intra-observer variability with a large number of images for ground truth validation. Full article
(This article belongs to the Section Medical & Healthcare AI)

33 pages, 9021 KB  
Article
Optimizing Urban Green Roofs: An Integrated Framework for Suitability, Economic Viability, and Microclimate Regulation
by Yuming Wu, Katsunori Furuya, Bowen Xiao and Ruochen Ma
Land 2025, 14(9), 1742; https://doi.org/10.3390/land14091742 - 27 Aug 2025
Viewed by 384
Abstract
Urban areas face significant challenges from heat islands, stormwater, and air pollution, yet green roof adoption is hindered by feasibility and economic uncertainties. This study proposes an integrated framework to optimize green roof strategies for urban sustainability. We combine deep learning for rooftop suitability screening, comprehensive ecosystem service valuation, life-cycle cost–benefit analysis under varying policy scenarios, and ENVI-met microclimate simulations across Local Climate Zones (LCZ). Using Dalian’s core urban districts as a case study, our findings reveal that all three green roof types (extensive, semi-intensive, and intensive) are economically viable when policy incentives and ecological values are fully internalized. Under the ideal scenario, intensive roofs yielded the highest long-term returns with a payback period of 4 years, while semi-intensive roofs achieved the greatest cost-effectiveness (BCR = 4.57) and the shortest payback period of 3 years; extensive roofs also reached break-even within 4 years. In contrast, under the realistic market-only scenario, only intensive roofs approached break-even with an extended payback period of 23 years, whereas extensive and semi-intensive systems remained unprofitable. Cooling performance varies by LCZ and roof type, emphasizing the critical role of urban morphology. This transferable framework provides robust data-driven decision support for green infrastructure planning and targeted policymaking in high-density urban environments. Full article
(This article belongs to the Special Issue Green Spaces and Urban Morphology: Building Sustainable Cities)
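The economic indicators quoted above, benefit–cost ratio and payback period, reduce to simple arithmetic once annual flows are estimated. A sketch with invented cash flows (the study's own figures come from its full life-cycle cost–benefit analysis):

```python
def benefit_cost_ratio(total_benefits, total_costs):
    """BCR > 1 means total benefits exceed total costs."""
    return total_benefits / total_costs

def payback_period(install_cost, annual_net_benefit, horizon=50):
    """First year whose cumulative net benefit covers the upfront cost,
    or None if the project never breaks even inside the horizon."""
    cumulative = 0.0
    for year in range(1, horizon + 1):
        cumulative += annual_net_benefit
        if cumulative >= install_cost:
            return year
    return None

# Invented flows: a roof costing 120 per unit area, returning 40 per year.
years_to_break_even = payback_period(120.0, 40.0)
```

The contrast in the abstract (3-4 year paybacks with incentives versus 23 years under market-only pricing) falls straight out of how much of the ecological value is internalized into the annual benefit term.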

17 pages, 3628 KB  
Article
A Unified Self-Supervised Framework for Plant Disease Detection on Laboratory and In-Field Images
by Xiaoli Huan, Bernard Chen and Hong Zhou
Electronics 2025, 14(17), 3410; https://doi.org/10.3390/electronics14173410 - 27 Aug 2025
Viewed by 274
Abstract
Early and accurate detection of plant diseases is essential for ensuring food security and maintaining sustainable agricultural productivity. However, most deep learning models for plant disease classification rely heavily on large-scale annotated datasets, which are expensive, labor-intensive, and often impractical to obtain in real-world farming environments. To address this limitation, we propose a unified self-supervised learning (SSL) framework that leverages unlabeled plant imagery to learn meaningful and transferable visual representations. Our method integrates three complementary objectives—Bootstrap Your Own Latent (BYOL), Masked Image Modeling (MIM), and contrastive learning—within a ResNet101 backbone, optimized through a hybrid loss function that captures global alignment, local structure, and instance-level distinction. GPU-based data augmentations are used to introduce stochasticity and enhance generalization during pretraining. Experimental results on the challenging PlantDoc dataset demonstrate that our model achieves an accuracy of 77.82%, with macro-averaged precision, recall, and F1-score of 80.00%, 78.24%, and 77.48%, respectively—on par with or exceeding most state-of-the-art supervised and self-supervised approaches. Furthermore, when fine-tuned on the PlantVillage dataset, the pretrained model attains 99.85% accuracy, highlighting its strong cross-domain generalization and practical transferability. These findings underscore the potential of self-supervised learning as a scalable, annotation-efficient, and robust solution for plant disease detection in real-world agricultural settings, especially where labeled data is scarce or unavailable. Full article
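The hybrid objective, a weighted combination of BYOL-style alignment, MIM, and contrastive terms, can be sketched abstractly. The cosine form below is the standard BYOL loss between two projected views; the weights and vectors are placeholders, not the paper's tuned values.

```python
import math

def cosine_loss(a, b):
    """BYOL-style loss: 2 - 2 * cos(a, b) between two projected views.
    Zero for aligned vectors, 4 for exactly opposite ones."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 2.0 - 2.0 * dot / (na * nb)

def hybrid_loss(l_byol, l_mim, l_contrastive, w=(1.0, 1.0, 1.0)):
    """Weighted sum of the three self-supervised objectives."""
    return w[0] * l_byol + w[1] * l_mim + w[2] * l_contrastive
```

In training, each term would be computed from augmented views of the same unlabeled image, so the combined loss rewards global alignment, local reconstruction, and instance discrimination at once.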

44 pages, 3439 KB  
Review
Conventional to Deep Learning Methods for Hyperspectral Unmixing: A Review
by Jinlin Zou, Hongwei Qu and Peng Zhang
Remote Sens. 2025, 17(17), 2968; https://doi.org/10.3390/rs17172968 - 27 Aug 2025
Viewed by 486
Abstract
Hyperspectral images often contain many mixed pixels, primarily resulting from their inherent complexity and low spatial resolution. To enhance surface classification and improve sub-pixel target detection accuracy, hyperspectral unmixing technology has consistently become a topical issue. This review provides a comprehensive overview of methodologies for hyperspectral unmixing, from traditional to advanced deep learning approaches. A systematic analysis of various challenges is presented, clarifying underlying principles and evaluating the strengths and limitations of prevalent algorithms. Hyperspectral unmixing is critical for interpreting spectral imagery but faces significant challenges: limited ground-truth data, spectral variability, nonlinear mixing effects, computational demands, and barriers to practical commercialization. Future progress requires bridging the gap to applications through user-centric solutions and integrating multi-modal and multi-temporal data. Research priorities include uncertainty quantification, transfer learning for generalization, neuromorphic edge computing, and developing tuning-free foundation models for cross-scenario robustness. This paper is designed to foster the commercial application of hyperspectral unmixing algorithms and to offer robust support for engineering applications within the hyperspectral remote sensing domain. Full article
(This article belongs to the Special Issue Artificial Intelligence in Hyperspectral Remote Sensing Data Analysis)
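The mixed-pixel problem the review addresses starts from the linear mixing model: an observed spectrum is an abundance-weighted sum of endmember spectra, with abundances non-negative and summing to one. A forward-model sketch with two invented endmembers over three bands:

```python
def mix_pixel(endmembers, abundances):
    """Linear mixing model: observed spectrum as abundance-weighted sum,
    under the non-negativity and sum-to-one abundance constraints."""
    assert abs(sum(abundances) - 1.0) < 1e-9
    assert all(a >= 0 for a in abundances)
    bands = len(endmembers[0])
    return [sum(a * e[b] for a, e in zip(abundances, endmembers))
            for b in range(bands)]

# Two hypothetical endmember spectra (vegetation, soil) over three bands.
veg = [0.05, 0.50, 0.40]
soil = [0.25, 0.30, 0.20]
pixel = mix_pixel([veg, soil], [0.6, 0.4])  # 60% vegetation, 40% soil
```

Unmixing is the inverse problem: recovering the abundances (and often the endmembers) from observed pixels, which is what makes nonlinear effects and spectral variability so troublesome.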

17 pages, 1602 KB  
Article
Deep Transfer Learning for Automatic Analysis of Ignitable Liquid Residues in Fire Debris Samples
by Ting-Yu Huang and Jorn Chi Chung Yu
Chemosensors 2025, 13(9), 320; https://doi.org/10.3390/chemosensors13090320 - 26 Aug 2025
Viewed by 279
Abstract
Interpreting chemical analysis results to identify ignitable liquid (IL) residues in fire debris samples is challenging, owing to the complex chemical composition of ILs and the diverse sample matrices. This work investigated a transfer learning approach with convolutional neural networks (CNNs), pre-trained for image recognition, to classify gas chromatography and mass spectrometry (GC/MS) data transformed into scalogram images. A small data set containing neat gasoline samples with diluted concentrations and burned Nylon carpets with varying weights was prepared to retrain six CNNs: GoogLeNet, AlexNet, SqueezeNet, VGG-16, ResNet-50, and Inception-v3. The classification tasks involved two classes: “positive of gasoline” and “negative of gasoline.” The results demonstrated that the CNNs performed very well in predicting the trained class data. When predicting untrained intra-laboratory class data, GoogLeNet had the highest accuracy (0.98 ± 0.01), precision (1.00 ± 0.01), sensitivity (0.97 ± 0.01), and specificity (1.00 ± 0.00). When predicting untrained inter-laboratory class data, GoogLeNet exhibited a sensitivity of 1.00 ± 0.00, while ResNet-50 achieved 0.94 ± 0.01 for neat gasoline. For simulated fire debris samples, both models attained sensitivities of 0.86 ± 0.02 and 0.89 ± 0.02, respectively. The new deep transfer learning approach enables automated pattern recognition in GC/MS data, facilitates high-throughput forensic analysis, and improves consistency in interpretation across various laboratories, making it a valuable tool for fire debris analysis. Full article
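The accuracy, precision, sensitivity, and specificity figures above follow directly from confusion-matrix counts. A sketch with hypothetical counts chosen so the outputs resemble GoogLeNet's reported intra-laboratory values:

```python
def classification_metrics(tp, fp, tn, fn):
    """Binary metrics, as for 'positive of gasoline' vs 'negative'."""
    return {
        "accuracy":    (tp + tn) / (tp + fp + tn + fn),
        "precision":   tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # recall on the positive class
        "specificity": tn / (tn + fp),
    }

# Hypothetical counts: 97 of 100 gasoline samples caught, no false alarms.
m = classification_metrics(tp=97, fp=0, tn=100, fn=3)
```

In forensic screening, sensitivity (not missing a true gasoline residue) and specificity (not falsely flagging a clean sample) carry different costs, which is why the paper reports all four.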

21 pages, 3700 KB  
Article
Lung Sound Classification Model for On-Device AI
by Jinho Park, Chanhee Jeong, Yeonshik Choi, Hyuck-ki Hong and Youngchang Jo
Appl. Sci. 2025, 15(17), 9361; https://doi.org/10.3390/app15179361 - 26 Aug 2025
Viewed by 269
Abstract
Following the COVID-19 pandemic, public interest in healthcare has significantly increased, emphasizing the importance of early disease detection through lung sound analysis. Lung sounds serve as a critical biomarker in the diagnosis of pulmonary diseases, and numerous deep learning-based approaches have been actively explored for this purpose. Existing lung sound classification models have demonstrated high accuracy, benefiting from recent advances in artificial intelligence (AI) technologies. However, these models often rely on transmitting data to computationally intensive servers for processing, introducing potential security risks due to the transfer of sensitive medical information over networks. To mitigate these concerns, on-device AI has garnered growing attention as a promising solution for protecting healthcare data. On-device AI enables local data processing and inference directly on the device, thereby enhancing data security compared to server-based schemes. Despite these advantages, on-device AI is inherently limited by computational constraints, while conventional models typically require substantial processing power to maintain high performance. In this study, we propose a lightweight lung sound classification model designed specifically for on-device environments. The proposed scheme extracts audio features using Mel spectrograms, chromagrams, and Mel-Frequency Cepstral Coefficients (MFCC), which are converted into image representations and stacked to form the model input. The lightweight model performs convolution operations tailored to both temporal and frequency–domain characteristics of lung sounds. Comparative experimental results demonstrate that the proposed model achieves superior inference performance while maintaining a significantly smaller model size than conventional classification schemes, making it well-suited for deployment on resource-constrained devices. Full article
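Two pieces of a front end like this can be sketched concretely: the standard mel-scale mapping behind a Mel spectrogram, and the channel-wise stacking of several feature images into one model input. The stacking here operates on toy per-frame lists; real inputs would be 2-D spectrogram images.

```python
import math

def hz_to_mel(f):
    """Standard HTK mel-scale mapping used when building Mel filter banks."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def stack_features(*feature_maps):
    """Stack per-frame features channel-wise, one channel per feature type
    (e.g., Mel spectrogram, chromagram, MFCC)."""
    return [list(frames) for frames in zip(*feature_maps)]

# Toy per-frame scalars standing in for three feature images.
stacked = stack_features([1, 2], [3, 4], [5, 6])
```

The mel mapping compresses high frequencies the way human hearing does (700 Hz maps to about 781 mel), and the stacking gives the convolutional model a multi-channel view of each time frame.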
16 pages, 10806 KB  
Article
A Rapid Segmentation Method Based on Few-Shot Learning: A Case Study on Roadways
by He Cai, Jiangchuan Chen, Yunfei Yin, Junpeng Yu and Zejiao Dong
Sensors 2025, 25(17), 5290; https://doi.org/10.3390/s25175290 - 26 Aug 2025
Viewed by 708
Abstract
Currently, deep learning-based segmentation methods are capable of achieving accurate segmentation. However, their deployment and training are costly and resource-intensive. To reduce deployment costs and facilitate the application of segmentation models for road imagery, this paper introduces a novel road segmentation algorithm based on few-shot learning. The algorithm consists of the back-projection module (BPM), responsible for generating target probabilities, and the segmentation module (SM), which performs image segmentation based on these probabilities. To achieve precise segmentation, the paper proposes a learning mechanism that simultaneously considers both positive and negative samples, effectively capturing the color features of the environment and objects. Additionally, through the workflow design, the algorithm can rapidly perform segmentation tasks across different scenarios without requiring transfer learning and with minimal sample prompts. Experimental results show that the algorithm achieves intersection over union segmentation accuracies of 94.9%, 92.7%, 94.9%, and 94.7% across different scenarios. Compared to state-of-the-art methods, it delivers precise segmentation with fewer local road image prompts, enabling efficient edge deployment. Full article
(This article belongs to the Section Sensing and Imaging)
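The positive/negative sample mechanism described in this abstract can be illustrated with a toy sketch. This is not the paper's BPM/SM code; it only shows the underlying idea of deriving a per-pixel target probability from the distance to the nearest positive versus nearest negative prompt colour.

```python
import numpy as np

def target_probability(img, pos_samples, neg_samples):
    """Per-pixel probability of belonging to the target class, based on
    the nearest positive vs. nearest negative sampled colour."""
    d_pos = np.linalg.norm(img[..., None, :] - pos_samples, axis=-1).min(axis=-1)
    d_neg = np.linalg.norm(img[..., None, :] - neg_samples, axis=-1).min(axis=-1)
    # Close to a positive colour -> probability near 1; close to a negative -> near 0.
    return d_neg / (d_pos + d_neg + 1e-8)

# Synthetic scene: grey "road" on the left, green "vegetation" on the right.
img = np.zeros((8, 8, 3))
img[:, :4] = [0.5, 0.5, 0.5]  # road pixels
img[:, 4:] = [0.1, 0.7, 0.1]  # background pixels

pos = np.array([[0.5, 0.5, 0.5]])  # a minimal road-pixel prompt
neg = np.array([[0.1, 0.7, 0.1]])  # a minimal background prompt

mask = target_probability(img, pos, neg) > 0.5
print(mask[:, :4].all(), mask[:, 4:].any())  # True False
```

A handful of prompt pixels per scene is enough for this kind of colour-prototype scoring, which is what lets a few-shot approach skip transfer learning entirely; the paper's actual modules refine such probabilities into the final segmentation.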
18 pages, 2565 KB  
Article
Rock Joint Segmentation in Drill Core Images via a Boundary-Aware Token-Mixing Network
by Seungjoo Lee, Yongjin Kim, Yongseong Kim, Jongseol Park and Bongjun Ji
Buildings 2025, 15(17), 3022; https://doi.org/10.3390/buildings15173022 - 25 Aug 2025
Viewed by 219
Abstract
The precise mapping of rock joint traces is fundamental to the design and safety assessment of foundations, retaining structures, and underground cavities in building and civil engineering. Existing deep learning approaches either impose prohibitive computational demands for on-site deployment or disrupt the topological continuity of subpixel lineaments that govern rock mass behavior. This study presents BATNet-Lite, a lightweight encoder–decoder architecture optimized for joint segmentation on resource-constrained devices. The encoder introduces a Boundary-Aware Token-Mixing (BATM) block that separates feature maps into patch tokens and directionally pooled stripe tokens, and a bidirectional attention mechanism subsequently transfers global context to local descriptors while refining stripe features, thereby capturing long-range connectivity with negligible overhead. A complementary Multi-Scale Line Enhancement (MLE) module combines depth-wise dilated and deformable convolutions to yield scale-invariant responses to joints of varying apertures. In the decoder, a Skeletal-Contrastive Decoder (SCD) employs dual heads to predict segmentation and skeleton maps simultaneously, while an InfoNCE-based contrastive loss enforces their topological consistency without requiring explicit skeleton labels. Training leverages a composite focal Tversky and edge IoU loss under a curriculum-thinning schedule, improving edge adherence and continuity. Ablation experiments confirm that BATM, MLE, and SCD each contribute substantial gains in boundary accuracy and connectivity preservation. By delivering topology-preserving joint maps with a small parameter count, BATNet-Lite facilitates rapid geological data acquisition for tunnel face mapping, slope inspection, and subsurface digital twin development, thereby supporting safer and more efficient building and underground engineering practice. Full article
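As a rough illustration of the focal Tversky component of the training loss mentioned in this abstract (the paper's exact formulation and hyper-parameters are not given here; the alpha, beta, and gamma values below are assumed, commonly used defaults):

```python
import numpy as np

def focal_tversky_loss(pred, target, alpha=0.7, beta=0.3, gamma=0.75, eps=1e-8):
    """Focal Tversky loss on soft predictions in [0, 1].
    alpha/beta weight false negatives vs. false positives (assumed values);
    gamma < 1 focuses training on hard, low-Tversky-index examples."""
    tp = (pred * target).sum()
    fn = ((1.0 - pred) * target).sum()
    fp = (pred * (1.0 - target)).sum()
    ti = (tp + eps) / (tp + alpha * fn + beta * fp + eps)  # Tversky index
    return (1.0 - ti) ** gamma

# Thin horizontal "joint" in a 16x16 tile, the kind of class-imbalanced
# target where Tversky-style losses outperform plain cross-entropy.
target = np.zeros((16, 16))
target[6:10, :] = 1.0
perfect = target.copy()
empty = np.zeros_like(target)

print(focal_tversky_loss(perfect, target))  # ~0.0
print(focal_tversky_loss(empty, target))    # ~1.0
```

Weighting false negatives more heavily than false positives (alpha > beta) is what makes this family of losses suitable for thin, sparse structures such as joint traces, where missing a few pixels can break the connectivity the paper's decoder is designed to preserve.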