Search Results (451)

Search Parameters:
Keywords = Res-UNet

35 pages, 4758 KB  
Article
Automated Detection of Beaver-Influenced Floodplain Inundations in Multi-Temporal Aerial Imagery Using Deep Learning Algorithms
by Evan Zocco, Chandi Witharana, Isaac M. Ortega and William Ouimet
ISPRS Int. J. Geo-Inf. 2025, 14(10), 383; https://doi.org/10.3390/ijgi14100383 - 30 Sep 2025
Abstract
Remote sensing provides a viable alternative for understanding landscape modifications attributed to beaver activity. The central objective of this study is to integrate multi-source remote sensing observations with a deep learning (DL) model (convolutional neural network or transformer) to automatically map beaver-influenced floodplain inundations (BIFI) over large geographical extents. We trained, validated, and tested eleven model configurations across three architectures using five ResNet and five B-Finetuned encoders. The training dataset consisted of >25,000 manually annotated aerial image tiles of BIFIs in Connecticut. The YOLOv8 architecture outperformed competing configurations, achieving an F1 score of 80.59% and a pixel-based map accuracy of 98.95%. The highest-performing SegFormer and U-Net++ models had F1 scores of 68.98% and 78.86%, respectively. The YOLOv8l-seg model was deployed at a statewide scale on 1 m resolution multi-temporal aerial imagery acquired from 1990 to 2019 under leaf-on and leaf-off conditions. Comparing leaf-on and leaf-off imagery from the same year supports a variety of inferences, although the model exhibits limitations in identifying BIFIs in panchromatic imagery in occluded environments. Study findings demonstrate the potential of harnessing historical and modern aerial image datasets with state-of-the-art DL models to increase our understanding of beaver activity across space and time.
16 pages, 2692 KB  
Article
Improved UNet-Based Detection of 3D Cotton Cup Indentations and Analysis of Automatic Cutting Accuracy
by Lin Liu, Xizhao Li, Hongze Lv, Jianhuang Wang, Fucai Lai, Fangwei Zhao and Xibing Li
Processes 2025, 13(10), 3144; https://doi.org/10.3390/pr13103144 - 30 Sep 2025
Abstract
With the advancement of intelligent technology and the rise in labor costs, manual identification and cutting of 3D cotton cup indentations can no longer meet modern demands. The increasing variety of 3D cotton cup shapes driven by personalized requirements makes fixed-mold cutting inefficient, requiring a large number of molds at high cost. Therefore, this paper proposes a UNet-based indentation segmentation algorithm to automatically extract 3D cotton cup indentation data. By incorporating the VGG16 network and the Leaky-ReLU activation function into the UNet model, the method improves the model’s generalization capability, convergence speed, and detection speed while reducing the risk of overfitting. Additionally, attention mechanisms and an Atrous Spatial Pyramid Pooling (ASPP) module are introduced to enhance the network’s spatial feature extraction ability. Experiments conducted on a self-made 3D cotton cup dataset demonstrate a precision of 99.53%, a recall of 99.69%, an mIoU of 99.18%, and an mPA of 99.73%, meeting practical application requirements. The extracted indentation contour data are automatically input into an intelligent CNC cutting machine to cut 3D cotton cups. Cutting results across 400 data points show an error of 0.20 mm ± 0.42 mm, meeting the cutting accuracy requirements for flexible-material 3D cotton cups. This study may serve as a reference for machine vision, image segmentation, improvements to deep learning architectures, and automated cutting machinery for flexible materials such as fabrics.
(This article belongs to the Section Automation Control Systems)
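Since several results in this listing lean on ASPP, a minimal PyTorch sketch of the module may help; the dilation rates and Leaky-ReLU slope here are illustrative assumptions, not the paper’s exact configuration.

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Minimal Atrous Spatial Pyramid Pooling: parallel dilated convs fused by a 1x1 conv."""
    def __init__(self, in_ch, out_ch, rates=(6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, 1)] +  # 1x1 branch keeps local detail
            [nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates]
        )
        self.project = nn.Sequential(
            nn.Conv2d(out_ch * (len(rates) + 1), out_ch, 1),
            nn.LeakyReLU(0.1),  # Leaky-ReLU, matching the activation the paper adopts
        )

    def forward(self, x):
        # every branch preserves H x W, so outputs concatenate cleanly on channels
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))
```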

23 pages, 347 KB  
Article
Comparative Analysis of Foundational, Advanced, and Traditional Deep Learning Models for Hyperpolarized Gas MRI Lung Segmentation: Robust Performance in Data-Constrained Scenarios
by Ramtin Babaeipour, Matthew S. Fox, Grace Parraga and Alexei Ouriadov
Bioengineering 2025, 12(10), 1062; https://doi.org/10.3390/bioengineering12101062 - 30 Sep 2025
Abstract
This study investigates the comparative performance of foundational models, advanced large-kernel architectures, and traditional deep learning approaches for hyperpolarized gas MRI segmentation across progressive data reduction scenarios. Chronic obstructive pulmonary disease (COPD) remains a leading global health concern, and advanced imaging techniques are crucial for its diagnosis and management. Hyperpolarized gas MRI, utilizing helium-3 (³He) and xenon-129 (¹²⁹Xe), offers a non-invasive way to assess lung function. We evaluated foundational models (Segment Anything Model and MedSAM), advanced architectures (UniRepLKNet and TransXNet), and traditional deep learning models (UNet with a VGG19 backbone, Feature Pyramid Network with a MiT-B5 backbone, and DeepLabV3 with a ResNet152 backbone) using four data availability scenarios: 100%, 50%, 25%, and 10% of the full training dataset (1640 2D MRI slices from 205 participants). The results demonstrate that foundational and advanced models achieve statistically equivalent performance across all data scenarios (p > 0.01), while both significantly outperform traditional architectures under data constraints (p < 0.001). Under extreme data scarcity (10% training data), foundational and advanced models maintained DSC values above 0.86, while traditional models experienced catastrophic performance collapse. This work highlights the critical advantage of architectures with large effective receptive fields in medical imaging applications where data collection is challenging, demonstrating their potential to democratize advanced medical imaging analysis in resource-limited settings.
(This article belongs to the Special Issue Artificial Intelligence-Based Medical Imaging Processing)

15 pages, 2713 KB  
Article
Deep Learning-Based Segmentation for Digital Epidermal Microscopic Images: A Comparative Study of Overall Performance
by Yeshun Yue, Qihang He and Yaobin Zou
Electronics 2025, 14(19), 3871; https://doi.org/10.3390/electronics14193871 - 29 Sep 2025
Abstract
Digital epidermal microscopic (DEM) images offer the potential to quantitatively analyze skin aging at the microscopic level. However, stochastic complexity, local highlights, and low contrast in DEM images pose significant challenges to accurate segmentation. This study evaluated eight deep learning models to identify methods capable of accurately segmenting complex DEM images while meeting diverse performance requirements. To this end, this study first constructed a manually labeled DEM image dataset. Then, eight deep learning models (FCN-8s, SegNet, UNet, ResUNet, NestedUNet, DeepLabV3+, TransUNet, and AttentionUNet) were systematically evaluated for their performance in DEM image segmentation. Our experimental findings show that AttentionUNet achieves the highest segmentation accuracy, with a DSC of 0.8696 and an IoU of 0.7703. In contrast, FCN-8s is a better choice for efficient segmentation due to its lower parameter count (18.64 M) and fast inference (37.36 ms on GPU). FCN-8s and NestedUNet show the best balance between accuracy and efficiency when assessed across segmentation accuracy, model size, and inference time. Through a systematic comparison of eight deep learning models, this study identifies suitable methods for segmenting skin furrows and ridges in DEM images, laying the foundation for subsequent applications such as analyzing skin aging through furrow and ridge features.
(This article belongs to the Special Issue AI-Driven Medical Image/Video Processing)
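As a reader’s aid, the DSC and IoU values quoted here (and throughout these listings) are plain mask overlaps; a minimal NumPy sketch:

```python
import numpy as np

def dice_iou(pred, target, eps=1e-7):
    """Dice similarity coefficient and intersection-over-union for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    dice = 2.0 * inter / (pred.sum() + target.sum() + eps)
    iou = inter / (np.logical_or(pred, target).sum() + eps)
    return dice, iou
```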

23 pages, 5751 KB  
Article
Automatic Diagnosis, Classification, and Segmentation of Abdominal Aortic Aneurysm and Dissection from Computed Tomography Images
by Hakan Baltaci, Sercan Yalcin, Muhammed Yildirim and Harun Bingol
Diagnostics 2025, 15(19), 2476; https://doi.org/10.3390/diagnostics15192476 - 27 Sep 2025
Abstract
Background/Objectives: Diagnosis of abdominal aortic aneurysm and abdominal aortic dissection (AAA and AAD) is of strategic importance, as cardiovascular disease has fatal implications worldwide. This study presents a novel deep learning-based approach for the accurate and efficient diagnosis of AAAs and AADs from CT images. Methods: Our proposed convolutional neural network (CNN) architecture extracts relevant features from CT scans and classifies regions as normal or diseased. Additionally, the model delineates the boundaries of detected aneurysms and dissections, aiding clinical decision-making. A pyramid scene parsing network is incorporated in a hybrid design: after the classification layer, the network splits into two branches, one determining whether an AAA or AAD region is present in the abdominal CT image and the other delineating the borders of the detected diseased region. Results: Both detection and segmentation are thus performed for AAA and AAD. The strategy was implemented and evaluated in Python. Average accuracy rates of 83.48%, 86.9%, 88.25%, and 89.64% were achieved using ResDenseUNet, INet, C-Net, and the proposed strategy, respectively. Likewise, intersection over union (IoU) values of 79.24%, 81.63%, 82.48%, and 83.76% were achieved using ResDenseUNet, INet, C-Net, and the proposed method. Conclusions: The proposed strategy is a promising technique for automatically diagnosing AAA and AAD, thereby reducing the workload of cardiovascular surgeons.
(This article belongs to the Special Issue Artificial Intelligence and Computational Methods in Cardiology 2025)

27 pages, 5776 KB  
Article
R-SWTNet: A Context-Aware U-Net-Based Framework for Segmenting Rural Roads and Alleys in China with the SQVillages Dataset
by Jianing Wu, Junqi Yang, Xiaoyu Xu, Ying Zeng, Yan Cheng, Xiaodong Liu and Hong Zhang
Land 2025, 14(10), 1930; https://doi.org/10.3390/land14101930 - 23 Sep 2025
Abstract
Rural road networks are vital for rural development, yet narrow alleys and occluded segments remain underrepresented in digital maps due to irregular morphology, spectral ambiguity, and limited model generalization. Traditional segmentation models struggle to balance local detail preservation and long-range dependency modeling, prioritizing either local features or global context alone. Hypothesizing that integrating hierarchical local features and global context will mitigate these limitations, this study proposes R-SWTNet, a context-aware U-Net-based framework for accurately segmenting such rural roads, and constructs the SQVillages dataset. R-SWTNet integrates ResNet34 for hierarchical feature extraction, a Swin Transformer for long-range dependency modeling, ASPP for multi-scale context fusion, and CAM-Residual blocks for channel-wise attention. The SQVillages dataset, built from multi-source remote sensing imagery, covers 18 diverse villages with adaptive augmentation to mitigate class imbalance. Experimental results show R-SWTNet achieves a validation IoU of 54.88% and an F1-score of 70.87%, outperforming U-Net and Swin-UNet, with less overfitting than R-Net and D-LinkNet. Its lightweight variant supports edge deployment, enabling on-site road management. This work provides a data-driven tool for infrastructure planning under China’s Rural Revitalization Strategy, with potential scalability to global unstructured rural road scenes.
(This article belongs to the Section Land Innovations – Data and Machine Learning)
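The abstract does not spell out the CAM-Residual block, but a squeeze-and-excitation-style channel attention wrapped in a residual unit is a reasonable sketch of the idea; the reduction ratio and layer layout below are assumptions, not the authors’ design.

```python
import torch.nn as nn
import torch.nn.functional as F

class CAMResidual(nn.Module):
    """Residual block with channel-wise attention (an assumed stand-in for CAM-Residual)."""
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch),
        )
        self.attn = nn.Sequential(  # global pool -> bottleneck -> per-channel gate
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        y = self.body(x)
        return F.relu(x + y * self.attn(y))  # reweight channels, then add the shortcut
```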

26 pages, 3973 KB  
Article
ViT-DCNN: Vision Transformer with Deformable CNN Model for Lung and Colon Cancer Detection
by Aditya Pal, Hari Mohan Rai, Joon Yoo, Sang-Ryong Lee and Yooheon Park
Cancers 2025, 17(18), 3005; https://doi.org/10.3390/cancers17183005 - 15 Sep 2025
Abstract
Background/Objectives: Lung and colon cancers remain among the most prevalent and fatal diseases worldwide, and their early detection is a serious challenge. The data used in this study were obtained from the Lung and Colon Cancer Histopathological Images Dataset, which comprises five classes of image data, namely colon adenocarcinoma, colon normal, lung adenocarcinoma, lung normal, and lung squamous cell carcinoma, split into training (80%), validation (10%), and test (10%) subsets. We propose the ViT-DCNN (Vision Transformer with Deformable CNN) model, with the aim of improving cancer detection and classification from medical images. Methods: Combining the ViT’s self-attention capabilities with deformable convolutions improves feature extraction, enabling the model to learn holistic contextual information as well as fine-grained localized spatial details. Results: On the test set, the model performed remarkably well, with an accuracy of 94.24%, an F1 score of 94.23%, a recall of 94.24%, and a precision of 94.37%, confirming its robustness in detecting cancerous tissues. Furthermore, the proposed ViT-DCNN model outperforms several state-of-the-art models, including ResNet-152, EfficientNet-B7, SwinTransformer, DenseNet-201, ConvNeXt, TransUNet, CNN-LSTM, MobileNetV3, and NASNet-A, across all major performance metrics. Conclusions: By using deep learning and advanced image analysis, this model enhances the efficiency of cancer detection, representing a valuable tool for radiologists and clinicians. This study demonstrates that the proposed ViT-DCNN model can reduce diagnostic inaccuracies and improve detection efficiency. Future work will focus on dataset enrichment and on enhancing the model’s interpretability to evaluate its clinical applicability. This paper demonstrates the promise of artificial-intelligence-driven diagnostic models in transforming lung and colon cancer detection and improving patient diagnosis.
(This article belongs to the Special Issue Image Analysis and Machine Learning in Cancers: 2nd Edition)
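Deformable convolution is available as a torchvision primitive; a minimal usage sketch follows, with offsets predicted by a plain convolution (a common pattern, not necessarily the authors’ exact design).

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformBlock(nn.Module):
    """3x3 deformable convolution whose sampling offsets are learned from the input."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.offset = nn.Conv2d(in_ch, 2 * k * k, k, padding=k // 2)  # (dx, dy) per kernel tap
        self.deform = DeformConv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x):
        return self.deform(x, self.offset(x))

out = DeformBlock(64, 128)(torch.randn(1, 64, 32, 32))  # -> (1, 128, 32, 32)
```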

21 pages, 4721 KB  
Article
Automated Brain Tumor MRI Segmentation Using ARU-Net with Residual-Attention Modules
by Erdal Özbay and Feyza Altunbey Özbay
Diagnostics 2025, 15(18), 2326; https://doi.org/10.3390/diagnostics15182326 - 13 Sep 2025
Abstract
Background/Objectives: Accurate segmentation of brain tumors in Magnetic Resonance Imaging (MRI) scans is critical for diagnosis and treatment planning due to their life-threatening nature. This study aims to develop a robust, automated method capable of precisely delineating heterogeneous tumor regions while improving segmentation accuracy and generalization. Methods: We propose Attention Res-UNet (ARU-Net), a novel Deep Learning (DL) architecture integrating residual connections, Adaptive Channel Attention (ACA), and Dimensional-space Triplet Attention (DTA) modules. The encoding module efficiently extracts and refines relevant feature information by applying ACA to the lower layers of convolutional and residual blocks. The DTA is attached to the upper layers of the decoding module, decoupling channel weights to better extract and fuse multi-scale features, enhancing both performance and efficiency. Input MRI images are pre-processed using Contrast Limited Adaptive Histogram Equalization (CLAHE) for contrast enhancement, denoising filters, and Linear Kuwahara filtering to preserve edges while smoothing homogeneous regions. The network is trained using categorical cross-entropy loss with the Adam optimizer on the BTMRII dataset, and comparative experiments are conducted against baseline U-Net, DenseNet121, and Xception models. Performance is evaluated using accuracy, precision, recall, F1-score, Dice Similarity Coefficient (DSC), and Intersection over Union (IoU) metrics. Results: The baseline U-Net showed significant gains after adding residual connections and ACA modules, with DSC improving by approximately 3.3%, accuracy by 3.2%, IoU by 7.7%, and F1-score by 3.3%. ARU-Net further enhanced segmentation performance, achieving 98.3% accuracy, 98.1% DSC, 96.3% IoU, and a superior F1-score, representing additional improvements of 1.1–2.0% over the U-Net + Residual + ACA variant. Visualizations confirmed smoother boundaries and more precise tumor contours across all six tumor classes, highlighting ARU-Net’s ability to capture heterogeneous tumor structures and fine structural details more effectively than both the baseline U-Net and other conventional DL models. Conclusions: ARU-Net, combined with an effective pre-processing strategy, provides a highly reliable and precise solution for automated brain tumor segmentation. Its improvements across multiple evaluation metrics over U-Net and other conventional models highlight its potential for clinical application and contribute novel insights to medical image analysis research.
(This article belongs to the Special Issue Advances in Functional and Structural MR Image Analysis)
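The CLAHE step of the pre-processing chain maps directly onto OpenCV; a sketch with illustrative parameters (OpenCV has no built-in Kuwahara filter, so a generic denoiser stands in for that stage):

```python
import cv2

def enhance_slice(gray_u8):
    """Contrast-enhance an 8-bit MRI slice with CLAHE, then lightly denoise it."""
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))  # illustrative values
    enhanced = clahe.apply(gray_u8)
    # the paper follows with Linear Kuwahara filtering; a non-local-means
    # denoiser is used here only as a stand-in for that edge-preserving step
    return cv2.fastNlMeansDenoising(enhanced, h=10)
```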

22 pages, 5732 KB  
Article
Explainable Transformer-Based Framework for Glaucoma Detection from Fundus Images Using Multi-Backbone Segmentation and vCDR-Based Classification
by Hind Alasmari, Ghada Amoudi and Hanan Alghamdi
Diagnostics 2025, 15(18), 2301; https://doi.org/10.3390/diagnostics15182301 - 10 Sep 2025
Abstract
Glaucoma is an eye disease caused by increased intraocular pressure (IOP) that affects the optic nerve head (ONH), leading to vision problems and irreversible blindness. Background/Objectives: Glaucoma is the second leading cause of blindness worldwide, and the number of people affected increases each year, expected to reach 111.8 million by 2040. This escalating trend is alarming given the shortage of ophthalmology specialists relative to the population. This study proposes an explainable end-to-end pipeline for automated glaucoma diagnosis from fundus images and evaluates the performance of Vision Transformers (ViTs) relative to traditional CNN-based models. Methods: The proposed system uses three datasets: REFUGE, ORIGA, and G1020. It begins with YOLOv11 for optic disc detection. The optic disc (OD) and optic cup (OC) are then segmented using U-Net with ResNet50, VGG16, and MobileNetV2 backbones, as well as MaskFormer with a Swin-Base backbone. Glaucoma is classified based on the vertical cup-to-disc ratio (vCDR). Results: MaskFormer outperforms all models in segmentation across IoU OD, IoU OC, DSC OD, and DSC OC, with scores of 88.29%, 91.09%, 93.83%, and 93.71%, respectively. For classification, it achieved an accuracy of 84.03% and an F1-score of 84.56%. Conclusions: By relying on the interpretable vCDR feature, the proposed framework enhances transparency and aligns with the principles of explainable AI, offering a trustworthy solution for glaucoma screening. Our findings show that Vision Transformers offer a promising approach for achieving high segmentation performance with explainable, biomarker-driven diagnosis.
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
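Once the OD and OC masks are segmented, the vCDR itself is a ratio of vertical extents; a minimal sketch (the 0.6 decision cutoff is a commonly cited clinical rule of thumb, not a value taken from this paper):

```python
import numpy as np

def vertical_cdr(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio from binary optic-cup and optic-disc masks."""
    def vertical_extent(mask):
        rows = np.where(mask.any(axis=1))[0]  # rows containing any mask pixel
        return rows.max() - rows.min() + 1 if rows.size else 0
    disc_h = vertical_extent(disc_mask)
    return vertical_extent(cup_mask) / disc_h if disc_h else 0.0

# suspect = vertical_cdr(cup_mask, disc_mask) > 0.6  # illustrative screening cutoff
```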

33 pages, 4751 KB  
Article
U-ResNet, a Novel Network Fusion Method for Image Classification and Segmentation
by Wenkai Li, Zhe Gao and Yaqing Song
Sensors 2025, 25(17), 5600; https://doi.org/10.3390/s25175600 - 8 Sep 2025
Abstract
Image classification and segmentation are important tasks in computer vision. ResNet and U-Net are representative networks for image classification and image segmentation, respectively. Although many studies have fused these two networks, most integrations focus on image segmentation with U-Net, overlooking ResNet’s capabilities for image classification. In this paper, we propose a novel U-ResNet structure that combines U-Net’s convolution–deconvolution structure (UBlock) with ResNet’s residual structure (ResBlock) in parallel. This parallel structure achieves rapid convergence and high accuracy in image classification and segmentation while efficiently alleviating the vanishing gradient problem. Specifically, the UBlock extracts and processes pixel-level features of both high- and low-resolution images. In the ResBlock, a Selected Upsampling (SU) module is introduced to enhance performance on low-resolution datasets, and an improved Efficient Upsampling Convolutional Block (EUCB*) with a Channel Shuffle mechanism is added before the ResBlock’s output to speed network convergence. Features from the ResBlock and UBlock are merged for better decision making. This architecture outperformed state-of-the-art (SOTA) models in both image classification and segmentation tasks on open-source and private datasets, and the function of each module was verified via ablation studies. U-ResNet displays strong feasibility and potential for advanced cross-paradigm tasks in computer vision.
(This article belongs to the Section Sensing and Imaging)

26 pages, 10494 KB  
Article
Data-Model Complexity Trade-Off in UAV-Acquired Ultra-High-Resolution Remote Sensing: Empirical Study on Photovoltaic Panel Segmentation
by Zhigang Zou, Xinhui Zhou, Pukaiyuan Yang, Jingyi Liu and Wu Yang
Drones 2025, 9(9), 619; https://doi.org/10.3390/drones9090619 - 3 Sep 2025
Abstract
With the growing adoption of deep learning in remote sensing, the increasing diversity of models and datasets has made method selection and experimentation more challenging, especially for non-expert users. This study presents a comprehensive evaluation of photovoltaic panel segmentation using a large-scale ultra-high-resolution benchmark of over 25,000 manually annotated unmanned aerial vehicle image patches, systematically quantifying the impact of model and data characteristics. Our results indicate that increasing the spatial diversity of training data has a more substantial impact on training stability and segmentation accuracy than simply adding spectral bands or enlarging dataset volume. Across all experimental settings, moderate-sized models (DeepLabV3_50, ResUNet50, and SegFormer B4) often provided the best trade-off between segmentation performance and computational efficiency, achieving an average Intersection over Union (IoU) of 0.8966, comparable to the 0.8970 of larger models. Moreover, model architecture plays a more critical role than model size: the ResUNet models consistently achieved higher mean IoU than both the DeepLabV3 and SegFormer models, with average improvements of 0.047 and 0.143, respectively. Our findings offer quantitative guidance for balancing architectural choices, model complexity, and dataset design, ultimately promoting more robust and efficient deployment of deep learning models in high-resolution remote sensing applications.

25 pages, 3974 KB  
Article
Modular Deep-Learning Pipelines for Dental Caries Data Streams: A Twin-Cohort Proof-of-Concept
by Ștefan Lucian Burlea, Călin Gheorghe Buzea, Florin Nedeff, Diana Mirilă, Valentin Nedeff, Maricel Agop, Dragoș Ioan Rusu and Laura Elisabeta Checheriță
Dent. J. 2025, 13(9), 402; https://doi.org/10.3390/dj13090402 - 2 Sep 2025
Abstract
Background: Dental caries arise from a multifactorial interplay between microbial dysbiosis, host immune responses, and enamel degradation visible on radiographs. Deep learning excels in image-based caries detection; however, integrative analyses that combine radiographic, microbiome, and transcriptomic data remain rare because public cohorts are seldom aligned. Objective: To determine whether three independent deep-learning pipelines (radiographic segmentation, microbiome regression, and transcriptome regression) can be reproducibly implemented on non-aligned datasets, and to demonstrate the feasibility of estimating microbiome heritability in a matched twin cohort. Methods: (i) A U-Net with a ResNet-18 encoder was trained on 100 annotated panoramic radiographs to generate a continuous caries-severity score from the predicted lesion area. (ii) Feed-forward neural networks (FNNs) were trained on supragingival 16S rRNA profiles (81 samples, 750 taxa) and gingival transcriptomes (247 samples, 54,675 probes) using randomly permuted severity scores as synthetic targets to stress-test preprocessing, training, and SHAP-based interpretability. (iii) In 49 monozygotic and 50 dizygotic twin pairs (n = 198), Bray–Curtis dissimilarity quantified microbial heritability, and an FNN was trained to predict recorded TotalCaries counts. Results: The U-Net achieved IoU = 0.564 (95% CI 0.535–0.594), precision = 0.624 (95% CI 0.583–0.667), and recall = 0.877 (95% CI 0.827–0.918), and correlated with manual severity scores (r = 0.62, p < 0.01). The synthetic-target FNNs converged consistently but, as intended, showed no predictive power (R² ≈ −0.15 for the microbiome; −0.18 for the transcriptome). Twin analysis revealed greater microbiome similarity, i.e., lower Bray–Curtis dissimilarity, in monozygotic versus dizygotic pairs (0.475 ± 0.107 vs. 0.557 ± 0.117; p = 0.0005) and a modest correlation between salivary features and caries burden (r = 0.25). Conclusions: Modular deep-learning pipelines remain computationally robust and interpretable on non-aligned datasets; radiographic severity provides a transferable quantitative anchor. The twin-cohort findings confirm heritable patterns in the oral microbiome and outline a pathway toward clinical translation once patient-matched multi-omics are available. This framework establishes a scalable, reproducible foundation for integrative caries research.
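The Bray–Curtis dissimilarity used in the twin analysis is a one-call computation in SciPy; a sketch on toy abundance profiles:

```python
import numpy as np
from scipy.spatial.distance import braycurtis

# toy relative-abundance profiles for two supragingival samples (illustrative only)
twin_a = np.array([0.30, 0.25, 0.20, 0.15, 0.10])
twin_b = np.array([0.28, 0.22, 0.25, 0.10, 0.15])

# 0 = identical communities, 1 = no shared taxa; lower values within
# monozygotic pairs are what the study reads as a heritability signal
print(braycurtis(twin_a, twin_b))
```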

39 pages, 11915 KB  
Article
Enhancing a Building Change Detection Model in Remote Sensing Imagery for Encroachments and Construction on Government Lands in Egypt as a Case Study
by Essam Mohamed AbdElhamied, Sherin Moustafa Youssef, Marwa Ali ElShenawy and Gouda Ismail Salama
Appl. Sci. 2025, 15(17), 9407; https://doi.org/10.3390/app15179407 - 27 Aug 2025
Abstract
Change detection (CD) in optical remote-sensing images is a critical task for applications such as urban planning, disaster monitoring, and environmental assessment. While UNet-based architectures have demonstrated strong performance in CD tasks, they often struggle to capture deep hierarchical features due to the limitations of plain convolutional layers. Conversely, ResNet architectures excel at learning deep features through residual connections but may lack precise localization capabilities. To address these challenges, we propose ResUNet++, a novel hybrid architecture that combines the strengths of ResNet and UNet for accurate and robust change detection. ResUNet++ integrates residual blocks into the UNet framework to enhance feature representation and mitigate gradient vanishing. Additionally, we introduce a Multi-Scale Feature Fusion (MSFF) module to aggregate features at different scales, improving the detection of both large and small changes. Experimental results on multiple datasets (EGY-CD, S2Looking, and LEVIR-CD) demonstrate that ResUNet++ outperforms state-of-the-art methods, achieving higher precision, recall, and F1-scores while maintaining computational efficiency.
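The core move described here, swapping U-Net’s plain double-conv blocks for residual units, can be sketched generically in PyTorch (the textbook construction, not the authors’ exact ResUNet++ block):

```python
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    """Double conv with an identity shortcut: the drop-in unit for a Res-UNet stage."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch),
        )
        # 1x1 projection so the shortcut matches the output channel count
        self.skip = nn.Conv2d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):
        return F.relu(self.body(x) + self.skip(x))
```

The shortcut path is what mitigates the vanishing-gradient problem the abstract mentions: gradients can flow through the identity/projection branch regardless of how the convolutional body behaves.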

21 pages, 3725 KB  
Article
Pruning-Friendly RGB-T Semantic Segmentation for Real-Time Processing on Edge Devices
by Jun Young Hwang, Youn Joo Lee, Ho Gi Jung and Jae Kyu Suhr
Electronics 2025, 14(17), 3408; https://doi.org/10.3390/electronics14173408 - 27 Aug 2025
Abstract
RGB-T semantic segmentation, which uses thermal and RGB images simultaneously, is actively being researched to robustly recognize the surroundings of vehicles under challenging lighting and weather conditions, and such models must operate in real time on edge devices. Because the transformer-based approaches that dominate recent RGB-T semantic segmentation research are very difficult to run on edge devices, this paper considers only CNN-based RGB-T semantic segmentation networks that can run on edge devices in real time. Although EAEFNet shows the best performance among CNN-based networks on edge devices, its inference speed is too slow for real-time operation, and even channel pruning yields minimal speed improvement. Analysis of EAEFNet identifies the intermediate fusion of RGB and thermal features and the high complexity of the decoder as the main causes. To address these issues, this paper proposes a network using a ResNet encoder with an early-fused four-channel input and a U-Net decoder structure. To improve decoder performance, bilinear upsampling is replaced with PixelShuffle, and mini Atrous Spatial Pyramid Pooling (ASPP) and Progressive Transposed Module (PTM) modules are applied. Since the proposed network is composed primarily of convolutional layers, channel pruning is effectively applicable; it significantly improves inference speed and enables real-time operation on the neural processing unit (NPU) of edge devices. The proposed network is evaluated on the MFNet dataset, one of the most widely used public datasets for RGB-T semantic segmentation, and achieves performance comparable to EAEFNet while operating at over 30 FPS on an embedded board equipped with the Qualcomm QCS6490 SoC.
(This article belongs to the Special Issue New Insights in 2D and 3D Object Detection and Semantic Segmentation)
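The bilinear-to-PixelShuffle swap in the decoder is compact in PyTorch; a sketch of one 2x upsampling stage (layer sizes are illustrative):

```python
import torch.nn as nn

def pixelshuffle_up(in_ch, out_ch, scale=2):
    """Conv expands channels by scale^2; PixelShuffle folds them into spatial resolution."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch * scale ** 2, 3, padding=1),
        nn.PixelShuffle(scale),  # (N, C*s^2, H, W) -> (N, C, H*s, W*s)
        nn.ReLU(inplace=True),
    )
```

Because the stage is just a convolution feeding a fixed channel rearrangement, it stays amenable to channel pruning, which fits the paper’s motivation.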

16 pages, 9579 KB  
Article
Video-Based Deep Learning Approach for Water Level Monitoring in Reservoirs
by Wallpyo Jung, Jongchan Kim, Hyeontak Jo, Seungyub Lee and Byunghyun Kim
Water 2025, 17(17), 2525; https://doi.org/10.3390/w17172525 - 25 Aug 2025
Abstract
This study developed a deep learning-based water level recognition model using Closed-Circuit Television (CCTV) footage. The model focuses on real-time water level recognition in agricultural reservoirs that lack automated water level gauges, with potential future extension to flood forecasting. Video data collected over approximately two years at the Myeonggyeong Reservoir in Chungcheongbuk-do, South Korea, were utilized. A semantic segmentation approach using the U-Net model was employed to extract water surface areas, followed by classification of water levels using Convolutional Neural Network (CNN), ResNet, and EfficientNet models. To improve learning efficiency, water level intervals were defined using both equal spacing and the Jenks natural breaks classification method. Among the models, EfficientNet achieved the highest performance with an accuracy of approximately 99%, while ResNet also demonstrated stable learning. In contrast, the CNN showed faster initial convergence but lower accuracy on complex intervals. This study confirms the feasibility of applying vision-based water level prediction to flood-prone agricultural reservoirs. Future work will focus on enhancing system performance through low-light video correction, multi-sensor integration, and model optimization using AutoML, contributing to the development of an intelligent, flood-resilient water resource management system.
(This article belongs to the Special Issue Machine Learning Methods for Flood Computation)
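Defining equal-spacing water-level classes (one of the two interval schemes mentioned) is a simple binning step; a sketch with hypothetical reservoir bounds:

```python
import numpy as np

def level_class(level_m, lo=120.0, hi=130.0, n_bins=10):
    """Map a water level (m) to an equal-width class index in [0, n_bins - 1].
    lo/hi are hypothetical reservoir bounds, not values from the paper."""
    edges = np.linspace(lo, hi, n_bins + 1)
    return int(np.clip(np.digitize(level_m, edges) - 1, 0, n_bins - 1))
```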
