Search Results (369)

Search Parameters:
Keywords = GoogLeNet

15 pages, 4650 KB  
Article
Rapid Discrimination of Platycodonis radix Geographical Origins Using Hyperspectral Imaging and Deep Learning
by Weihang Xing, Xuquan Wang, Zhiyuan Ma, Yujie Xing, Xiong Dun and Xinbin Cheng
Optics 2025, 6(4), 52; https://doi.org/10.3390/opt6040052 - 13 Oct 2025
Viewed by 126
Abstract
Platycodonis radix is a commonly used traditional Chinese medicine (TCM) material whose bioactive compounds and medicinal value are closely related to its geographical origin. The internal composition of Platycodonis radix differs across origins due to environmental factors such as soil and climate, and these differences affect its medicinal value, so accurate identification of origin is crucial for drug safety and scientific research. Traditional methods for identifying TCM materials, such as morphological identification and physicochemical analysis, cannot meet efficiency requirements. Although emerging technologies such as computer vision and spectroscopy enable rapid detection, their accuracy in identifying the origin of Platycodonis radix is limited when they rely solely on RGB images or spectral features. To solve this problem, we developed a rapid, non-destructive, and accurate method for origin identification of Platycodonis radix using hyperspectral imaging (HSI) combined with deep learning. We captured hyperspectral images of Platycodonis radix slices in the 400–1000 nm range and propose a deep learning classification model based on these images. Our model uses one-dimensional (1D) convolution kernels to extract spectral features and two-dimensional (2D) convolution kernels to extract spatial features, fully utilizing the hyperspectral data. The average accuracy reached 96.2%, significantly better than the 49.0% obtained from RGB images and the 81.8% obtained from spectral features alone in the same range. Furthermore, on hyperspectral images our model's accuracy is 14.6%, 8.4%, and 9.6% higher than that of VGG, ResNet, and GoogLeNet variants, respectively. These results demonstrate both the advantages of HSI in identifying the origin of Platycodonis radix and the benefit of combining 1D and 2D convolutions in hyperspectral image classification.
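The core design above, 1D kernels along the spectral axis plus 2D kernels over the spatial axes, can be sketched as follows. This is a minimal illustration under assumed shapes (200 bands, four origin classes) and layer sizes, not the authors' published architecture.

```python
import torch
import torch.nn as nn

class SpectralSpatialNet(nn.Module):
    """Toy hybrid 1D/2D CNN for a hyperspectral cube of shape (B, bands, H, W)."""
    def __init__(self, n_bands=200, n_classes=4):
        super().__init__()
        # 1D branch: each pixel's spectrum as a one-channel sequence
        self.spectral = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        # 2D branch: the full cube as a many-channel image
        self.spatial = nn.Sequential(
            nn.Conv2d(n_bands, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(16 + 32, n_classes)

    def forward(self, cube):
        b, c, h, w = cube.shape
        spec = cube.permute(0, 2, 3, 1).reshape(b * h * w, 1, c)
        spec = self.spectral(spec).reshape(b, h * w, 16).mean(dim=1)
        spat = self.spatial(cube).flatten(1)
        return self.head(torch.cat([spec, spat], dim=1))

logits = SpectralSpatialNet()(torch.randn(2, 200, 16, 16))  # -> (2, 4)
```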

27 pages, 9738 KB  
Article
Machine Learning Recognition and Phase Velocity Estimation of Atmospheric Gravity Waves from OI 557.7 nm All-Sky Airglow Images
by Rady Mahmoud, Moataz Abdelwahab, Kazuo Shiokawa and Ayman Mahrous
AI 2025, 6(10), 262; https://doi.org/10.3390/ai6100262 - 7 Oct 2025
Viewed by 499
Abstract
Atmospheric gravity waves (AGWs) appear as density perturbations of the atmosphere and play an important role in atmospheric dynamics. Using All-Sky Airglow Imagers (ASAIs) with an OI 557.7 nm filter, AGW phase velocities and propagation directions have previously been extracted from images classified by visual inspection, with airglow images collected by the OMTI network at Shigaraki (34.85 N, 134.11 E) from October 1998 to October 2002. Studying the seasonal variation of AGWs in the middle atmosphere, however, requires processing and classifying a large dataset of airglow images. In this article, a machine learning approach to recognizing AGWs in ASAI images is proposed, using three convolutional neural networks (CNNs): AlexNet, GoogLeNet, and ResNet-50. Out of 13,201 deviated images, 1192 with very weak or unclear AGW signatures were eliminated during quality control. All networks were trained and tested on the remaining 12,007 classified images, which approximately cover the solar maximum within the period mentioned above. In the testing phase, AlexNet achieved the highest accuracy of 98.41%. We then estimate AGW zonal and meridional phase velocities in the mesosphere with a cascade forward neural network (CFNN) trained and tested on AGW and neutral-wind data. AGW data were extracted from the classified AGW images by event and spectral methods, while wind data were taken from the Horizontal Wind Model (HWM) and the middle and upper atmosphere radar at Shigaraki. The estimated phase velocities achieved a correlation coefficient (R) above 0.89 in all training and testing phases, and a comparison with existing studies confirms the accuracy of the proposed approaches, including AGW velocity forecasting.
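As a rough sketch of the transfer-learning setup such a study implies, the classifier head of a pretrained AlexNet can be swapped for a two-class (AGW vs. no-AGW) output before fine-tuning; the class count, optimizer, and learning rate here are assumptions, not values from the paper.

```python
import torch
import torch.nn as nn
from torchvision import models

# Pretrained AlexNet; replace the final 1000-way layer with 2 classes.
model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
model.classifier[6] = nn.Linear(4096, 2)  # AGW present / absent
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
```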

20 pages, 620 KB  
Article
Discriminative Regions and Adversarial Sensitivity in CNN-Based Malware Image Classification
by Anish Roy and Fabio Di Troia
Electronics 2025, 14(19), 3937; https://doi.org/10.3390/electronics14193937 - 4 Oct 2025
Viewed by 412
Abstract
The escalating prevalence of malware poses a significant threat to digital infrastructure, demanding robust yet efficient detection methods. In this study, we evaluate multiple Convolutional Neural Network (CNN) architectures, including a basic CNN, LeNet, AlexNet, GoogLeNet, and DenseNet, on a dataset of 11,000 malware images spanning 452 families. Our experiments demonstrate that CNN models can achieve reliable classification performance across both multiclass and binary tasks. However, we also uncover a critical weakness: even minimal image perturbations, such as modification of fewer than 1% of the total image pixels, drastically degrade accuracy, revealing CNNs' fragility in adversarial settings. A key contribution of this work is a spatial analysis of malware images, which shows that discriminative features concentrate disproportionately in the bottom-left quadrant. This spatial bias likely reflects semantic structure, as malware payload information often resides near the end of binary files when rasterized. Notably, models trained on this region outperform those trained on other sections, underscoring the importance of spatial awareness in malware classification. Taken together, our results show that CNN-based malware classifiers are simultaneously effective and vulnerable: they learn strong representations but remain sensitive to both subtle perturbations and positional bias. These findings highlight the need for future detection systems that combine robustness to noise with resilience against spatial distortions to ensure reliability in real-world adversarial environments.
(This article belongs to the Special Issue AI and Cybersecurity: Emerging Trends and Key Challenges)
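A minimal sketch of the kind of perturbation test described above: overwrite fewer than 1% of a grayscale malware image's pixels with random values and compare the classifier's predictions before and after. The fraction and image size are assumptions.

```python
import numpy as np

def perturb(img, frac=0.01, rng=np.random.default_rng(0)):
    """Randomly overwrite about `frac` of the pixels with noise."""
    out = img.copy()
    n = int(frac * img.size)                      # < 1% of all pixels
    idx = rng.choice(img.size, size=n, replace=False)
    out.flat[idx] = rng.integers(0, 256, size=n)
    return out

img = np.zeros((64, 64), dtype=np.uint8)          # stand-in malware image
changed = (perturb(img) != img).sum()             # at most 1% of pixels differ
```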

18 pages, 2980 KB  
Article
Deep Learning-Based Identification of Kazakhstan Apple Varieties Using Pre-Trained CNN Models
by Jakhfer Alikhanov, Tsvetelina Georgieva, Eleonora Nedelcheva, Aidar Moldazhanov, Akmaral Kulmakhambetova, Dmitriy Zinchenko, Alisher Nurtuleuov, Zhandos Shynybay and Plamen Daskalov
AgriEngineering 2025, 7(10), 331; https://doi.org/10.3390/agriengineering7100331 - 1 Oct 2025
Viewed by 413
Abstract
This paper presents a digital approach for identifying apple varieties bred in Kazakhstan using deep learning and transfer learning. The main objective of this study is to develop and evaluate an algorithm for automatic varietal classification of apples based on color images obtained under controlled conditions. Five representative cultivars were selected as research objects: Aport Alexander, Ainur, Sinap Almaty, Nursat, and Kazakhskij Yubilejnyj. The fruit samples were collected in the pomological garden of the Kazakh Research Institute of Fruit and Vegetable Growing, ensuring representativeness and accounting for the natural variability of the cultivars. Two convolutional neural network (CNN) architectures—GoogLeNet and SqueezeNet—were fine-tuned using transfer learning with different optimization settings. The data processing pipeline included preprocessing, formation of training and validation sets, and augmentation techniques to improve model generalization. Network performance was assessed using standard evaluation metrics such as accuracy, precision, and recall, complemented by confusion matrix analysis to reveal potential misclassifications. The results demonstrated high recognition efficiency: classification accuracy exceeded 95% for most cultivars, while the Ainur variety achieved 100% recognition when tested with GoogLeNet. Interestingly, the Nursat variety achieved its best results with SqueezeNet, which highlights the importance of model selection for specific apple types. These findings confirm the applicability of CNN-based deep learning for varietal recognition of Kazakhstan apple cultivars. The novelty of this study lies in applying neural network models to local Kazakhstan apple varieties for the first time, which is of both scientific and practical importance. The practical contribution of the research is the potential integration of the developed method into industrial fruit-sorting systems, increasing productivity, objectivity, and precision in post-harvest processing. The main limitations of this study are the relatively small dataset and the controlled laboratory image acquisition conditions. Future research will focus on expanding the dataset, testing the models in real production environments, and exploring more advanced deep learning architectures to further improve recognition performance.
(This article belongs to the Special Issue Implementation of Artificial Intelligence in Agriculture)
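For illustration, the evaluation described above reduces to simple confusion-matrix arithmetic; the 5 × 5 matrix below uses fabricated counts for five classes, not the paper's results.

```python
import numpy as np

cm = np.diag([50.0, 48, 49, 47, 50])      # hypothetical correct counts per cultivar
cm[1, 3] = 2                              # a hypothetical mix-up between two cultivars
accuracy = np.trace(cm) / cm.sum()
precision = np.diag(cm) / cm.sum(axis=0)  # per predicted class (columns)
recall = np.diag(cm) / cm.sum(axis=1)     # per true class (rows)
```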

17 pages, 2347 KB  
Article
A Convolutional Neural Network-Based Vehicle Security Enhancement Model: A South African Case Study
by Thapelo Samuel Matlala, Michael Moeti, Khuliso Sigama and Relebogile Langa
Appl. Sci. 2025, 15(19), 10584; https://doi.org/10.3390/app151910584 - 30 Sep 2025
Viewed by 235
Abstract
This paper applies a Convolutional Neural Network (CNN)-based vehicle security enhancement model, with a specific focus on the South African context. While conventional security systems, including immobilizers, alarms, steering locks, and GPS trackers, provide a baseline level of protection, they are increasingly circumvented by technologically adept adversaries. These limitations have spurred the development of advanced security solutions leveraging artificial intelligence (AI), with particular emphasis on computer vision and deep learning. This paper presents a CNN-based Vehicle Security Enhancement Model (CNN-based VSEM) that integrates facial recognition with GSM and GPS technologies to provide a robust, real-time security solution in South Africa. The study contributes a novel integration of CNN-based authentication with GSM and GPS tracking in the South African context, validated on a functional prototype. The prototype, developed on a Raspberry Pi 4 platform, was validated through practical demonstrations and user evaluations. The system achieved an average recognition accuracy of 85.9%, with some identities reaching 100% classification accuracy. While misclassifications led to an estimated False Acceptance Rate (FAR) of ~5% and False Rejection Rate (FRR) of ~12%, the model consistently enabled secure authentication. Preliminary latency tests indicated a decision time of approximately 1.8 s from image capture to ignition authorization. These results, together with positive user feedback, confirm the model's feasibility and reliability. This integrated approach presents a promising advancement in intelligent vehicle security for regions with high rates of vehicle theft. Future enhancements will explore 3D sensing, infrared imaging, and facial recognition robust to variations in facial appearance. Additionally, the model is designed to detect authorized users, identify suspicious behaviour in the vicinity of the vehicle, and provide an added layer of protection against unauthorized access.
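The two error rates quoted above follow from simple counts, as in this back-of-envelope sketch; the attempt numbers are hypothetical, chosen only to reproduce rates near the reported ~5% and ~12%.

```python
false_accepts, impostor_attempts = 5, 100     # hypothetical counts
false_rejects, genuine_attempts = 12, 100
far = false_accepts / impostor_attempts       # False Acceptance Rate ~ 0.05
frr = false_rejects / genuine_attempts        # False Rejection Rate  ~ 0.12
```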

17 pages, 2525 KB  
Article
A Non-Destructive Deep Learning–Based Method for Shrimp Freshness Assessment in Food Processing
by Dongyu Hao, Cunxi Zhang, Rui Wang, Qian Qiao, Linsong Gao, Jin Liu and Rongsheng Lin
Processes 2025, 13(9), 2895; https://doi.org/10.3390/pr13092895 - 10 Sep 2025
Viewed by 500
Abstract
Maintaining the freshness of shrimp is a critical issue in quality and safety control within the food processing industry. Traditional methods often rely on destructive techniques, which are difficult to apply in online real-time monitoring. To address this challenge, this study proposes a non-destructive approach to shrimp freshness assessment based on imaging and deep learning, enabling efficient and reliable freshness classification. The core innovation of the method lies in an improved GoogLeNet architecture: by incorporating the ELU activation function, L2 regularization, and the RMSProp optimizer, combined with a transfer learning strategy, the model effectively enhances generalization capability and stability under limited sample conditions. Evaluated on a shrimp image dataset rigorously annotated against TVB-N reference values, the proposed model achieved an accuracy of 93% with a test loss of only 0.2. Ablation studies further confirmed the contribution of the architectural and training-strategy modifications to this performance. The results demonstrate that the method enables rapid, non-contact freshness discrimination, making it suitable for real-time sorting and quality monitoring on shrimp processing lines and providing a feasible pathway for deployment on edge computing devices. This study offers a practical solution for intelligent non-destructive detection in aquatic products, with strong potential for engineering applications.
(This article belongs to the Section Food Process Engineering)
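A hedged sketch of that training recipe: ELU in a replacement classifier head on a pretrained GoogLeNet, with L2 regularization expressed as weight decay in an RMSProp optimizer. The layer sizes, the three-grade output, and all hyperparameters are assumptions, not the paper's values.

```python
import torch
import torch.nn as nn
from torchvision import models

backbone = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)
backbone.fc = nn.Sequential(        # new head on the 1024-D features
    nn.Linear(1024, 256),
    nn.ELU(),                       # ELU instead of ReLU
    nn.Linear(256, 3),              # e.g., three freshness grades
)
optimizer = torch.optim.RMSprop(
    backbone.parameters(), lr=1e-4,
    weight_decay=1e-4,              # L2 regularization
)
```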

27 pages, 1902 KB  
Article
Few-Shot Breast Cancer Diagnosis Using a Siamese Neural Network Framework and Triplet-Based Loss
by Tea Marasović and Vladan Papić
Algorithms 2025, 18(9), 567; https://doi.org/10.3390/a18090567 - 8 Sep 2025
Viewed by 542
Abstract
Breast cancer is one of the leading causes of death among women of all ages and backgrounds globally. In recent years, the growing deficit of expert radiologists—particularly in underdeveloped countries—alongside a surge in the number of images for analysis has negatively affected the ability to deliver timely and precise diagnostic results in breast cancer screening. AI technologies offer powerful tools for effective diagnosis and survival forecasting, reducing the dependency on human cognitive input. Towards this aim, this research introduces a deep meta-learning framework for swift analysis of mammography images—combining a Siamese network model with a triplet-based loss function—to facilitate automatic screening of potentially suspicious breast cancer cases. Three pre-trained deep CNN architectures, namely GoogLeNet, ResNet50, and MobileNetV3, are fine-tuned and scrutinized for their effectiveness in transforming input mammograms into a suitable embedding space. The proposed framework undergoes a comprehensive evaluation through a rigorous series of experiments on two publicly accessible and widely used datasets of digital X-ray mammograms: INbreast and CBIS-DDSM. The experimental results demonstrate the framework's strong performance in differentiating between tumorous and normal images on both datasets, even with a very limited number of training samples.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition (3rd Edition))
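The Siamese/triplet setup described above can be sketched in a few lines: one shared embedding network maps anchor, positive, and negative mammogram crops to vectors, and a margin loss pulls same-class pairs together. The ResNet50 backbone matches one of the three tested; the 128-D embedding and margin are assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

embed = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
embed.fc = nn.Linear(embed.fc.in_features, 128)   # project to embedding space
criterion = nn.TripletMarginLoss(margin=1.0)

# Stand-in batches: anchor and positive share a class, negative differs.
anchor, positive, negative = (torch.randn(4, 3, 224, 224) for _ in range(3))
loss = criterion(embed(anchor), embed(positive), embed(negative))
```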

17 pages, 1602 KB  
Article
Deep Transfer Learning for Automatic Analysis of Ignitable Liquid Residues in Fire Debris Samples
by Ting-Yu Huang and Jorn Chi Chung Yu
Chemosensors 2025, 13(9), 320; https://doi.org/10.3390/chemosensors13090320 - 26 Aug 2025
Viewed by 734
Abstract
Interpreting chemical analysis results to identify ignitable liquid (IL) residues in fire debris samples is challenging, owing to the complex chemical composition of ILs and the diverse sample matrices. This work investigated a transfer learning approach in which convolutional neural networks (CNNs) pre-trained for image recognition classify gas chromatography and mass spectrometry (GC/MS) data transformed into scalogram images. A small dataset containing neat gasoline samples at diluted concentrations and burned Nylon carpets of varying weights was prepared to retrain six CNNs: GoogLeNet, AlexNet, SqueezeNet, VGG-16, ResNet-50, and Inception-v3. The classification task involved two classes: "positive for gasoline" and "negative for gasoline." The results demonstrated that the CNNs performed very well in predicting the trained class data. When predicting untrained intra-laboratory class data, GoogLeNet had the highest accuracy (0.98 ± 0.01), precision (1.00 ± 0.01), sensitivity (0.97 ± 0.01), and specificity (1.00 ± 0.00). When predicting untrained inter-laboratory class data, GoogLeNet exhibited a sensitivity of 1.00 ± 0.00, while ResNet-50 achieved 0.94 ± 0.01 for neat gasoline; for simulated fire debris samples, the two models attained sensitivities of 0.86 ± 0.02 and 0.89 ± 0.02, respectively. The new deep transfer learning approach enables automated pattern recognition in GC/MS data, facilitates high-throughput forensic analysis, and improves consistency of interpretation across laboratories, making it a valuable tool for fire debris analysis.
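A minimal sketch of the transformation named above, assuming a continuous wavelet transform of a 1-D GC/MS total-ion chromatogram; the Morlet wavelet, scale range, and colormap are assumptions, not the paper's settings.

```python
import numpy as np
import pywt
import matplotlib.pyplot as plt

tic = np.random.rand(1024)                        # stand-in chromatogram
coef, _ = pywt.cwt(tic, scales=np.arange(1, 65), wavelet='morl')
plt.imsave('scalogram.png', np.abs(coef), cmap='jet')  # image input for the CNN
```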

26 pages, 6425 KB  
Article
Deep Spectrogram Learning for Gunshot Classification: A Comparative Study of CNN Architectures and Time-Frequency Representations
by Pafan Doungpaisan and Peerapol Khunarsa
J. Imaging 2025, 11(8), 281; https://doi.org/10.3390/jimaging11080281 - 21 Aug 2025
Viewed by 918
Abstract
Gunshot sound classification plays a crucial role in public safety, forensic investigations, and intelligent surveillance systems. This study evaluates the performance of deep learning models in classifying firearm sounds across twelve time–frequency spectrogram representations, including Mel, Bark, MFCC, CQT, Cochleagram, STFT, FFT, Reassigned, Chroma, Spectral Contrast, and Wavelet. The dataset consists of 2148 gunshot recordings from four firearm types, collected in a semi-controlled outdoor environment under multi-orientation conditions. To leverage advanced computer vision techniques, all spectrograms were converted into RGB images using perceptually informed colormaps. This enabled the application of image processing approaches and the fine-tuning of pre-trained Convolutional Neural Networks (CNNs) originally developed for natural image classification. Six CNN architectures—ResNet18, ResNet50, ResNet101, GoogLeNet, Inception-v3, and InceptionResNetV2—were trained on these spectrogram images. Experimental results indicate that CQT, Cochleagram, and Mel spectrograms consistently achieved high classification accuracy, exceeding 94% when paired with deep CNNs such as ResNet101 and InceptionResNetV2. These findings demonstrate that transforming time–frequency features into RGB images not only facilitates the use of image-based processing but also allows deep models to capture rich spectral–temporal patterns, providing a robust framework for accurate firearm sound classification.
(This article belongs to the Section Image and Video Processing)
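As a sketch of the representation step, one of the twelve inputs (the Mel spectrogram) can be rendered through a perceptual colormap into an RGB image ready for CNN fine-tuning; the librosa parameters, colormap, and file name are assumptions.

```python
import numpy as np
import librosa
import matplotlib.pyplot as plt

y, sr = librosa.load('gunshot.wav', sr=22050)     # hypothetical recording
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)     # log scale for contrast
plt.imsave('mel.png', mel_db, cmap='magma')       # RGB spectrogram image
```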

23 pages, 1938 KB  
Article
Algorithmic Silver Trading via Fine-Tuned CNN-Based Image Classification and Relative Strength Index-Guided Price Direction Prediction
by Yahya Altuntaş, Fatih Okumuş and Adnan Fatih Kocamaz
Symmetry 2025, 17(8), 1338; https://doi.org/10.3390/sym17081338 - 16 Aug 2025
Viewed by 1165
Abstract
Predicting short-term buy and sell signals in financial markets remains a significant challenge for algorithmic trading. The difficulty stems from the data's inherent volatility and noise, which often lead to spurious signals and poor trading performance. This paper presents a novel algorithmic trading model for silver that combines fine-tuned Convolutional Neural Networks (CNNs) with a decision filter based on the Relative Strength Index (RSI). The technique predicts buy and sell points by turning time series data into chart images: daily silver price-per-ounce data were converted into chart images using technical analysis indicators. Four pre-trained CNNs, namely AlexNet, VGG16, GoogLeNet, and ResNet-50, were fine-tuned on the generated image dataset to find the best architecture in terms of classification and financial performance. The models were evaluated using walk-forward validation with an expanding window, which makes the tests more realistic and the performance evaluation more robust under different market conditions. Fine-tuned VGG16 with the RSI filter had the best cost-adjusted profitability, with a cumulative return of 115.03% over five years, nearly double the 61.62% return of a buy-and-hold strategy. This outperformance is especially notable because the evaluation period was mostly an uptrend, which makes passive benchmarks hard to beat. Adding the RSI filter also led the models to make more disciplined decisions, reducing low-confidence transactions. Overall, the results show that pre-trained CNNs fine-tuned on visual representations, when supplemented with domain-specific heuristics, can provide strong and cost-effective solutions for algorithmic trading, even under realistic cost assumptions.
(This article belongs to the Section Computer)
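The RSI filter can be sketched as below: a 14-period index computed from daily closes, with a CNN buy signal acted on only when the market is not overbought. The simple (non-smoothed) averaging, the period, and the 70 threshold are common defaults, not necessarily the paper's settings.

```python
import numpy as np

def rsi(closes, period=14):
    deltas = np.diff(closes[-(period + 1):])      # last `period` daily changes
    gain = np.clip(deltas, 0, None).mean()
    loss = -np.clip(deltas, None, 0).mean()
    rs = gain / loss if loss else np.inf
    return 100 - 100 / (1 + rs)

closes = np.cumsum(np.random.randn(60)) + 25.0    # stand-in silver closes
execute_buy = rsi(closes) < 70                    # veto overbought entries
```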

28 pages, 6624 KB  
Article
YoloMal-XAI: Interpretable Android Malware Classification Using RGB Images and YOLO11
by Chaymae El Youssofi and Khalid Chougdali
J. Cybersecur. Priv. 2025, 5(3), 52; https://doi.org/10.3390/jcp5030052 - 1 Aug 2025
Viewed by 1231
Abstract
As Android malware grows increasingly sophisticated, traditional detection methods struggle to keep pace, creating an urgent need for robust, interpretable, and real-time solutions to safeguard mobile ecosystems. This study introduces YoloMal-XAI, a novel deep learning framework that transforms Android application files into RGB image representations by mapping DEX (Dalvik Executable), Manifest.xml, and Resources.arsc files to distinct color channels. Evaluated on the CICMalDroid2020 dataset using YOLO11 pretrained classification models, YoloMal-XAI achieves 99.87% accuracy in binary classification and 99.56% in multi-class classification (Adware, Banking, Riskware, SMS, and Benign). Compared to ResNet-50, GoogLeNet, and MobileNetV2, YOLO11 offers competitive accuracy with at least 7× faster training over 100 epochs; against YOLOv8, it achieves comparable or superior accuracy while reducing training time by up to 3.5×. Cross-corpus validation on Drebin and CICAndMal2017 further confirms the model's generalization to previously unseen malware. An ablation study highlights the value of integrating the DEX, Manifest, and Resources components, with the full RGB configuration consistently delivering the best performance. Explainable AI (XAI) techniques—Grad-CAM, Grad-CAM++, Eigen-CAM, and HiRes-CAM—are employed to interpret model decisions, revealing the DEX segment as the most influential component. These results establish YoloMal-XAI as a scalable, efficient, and interpretable framework for Android malware detection, with strong potential for deployment on resource-constrained mobile devices.
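A hedged sketch of the channel mapping described above: raw bytes of the three APK components fill the red, green, and blue planes of one square image. The 256 × 256 size and the zero-padding/truncation policy are assumptions.

```python
import numpy as np

def channel(raw: bytes, side=256):
    """Fit a byte stream into one side x side uint8 plane."""
    buf = np.frombuffer(raw, dtype=np.uint8)[: side * side]
    return np.pad(buf, (0, side * side - buf.size)).reshape(side, side)

parts = ('classes.dex', 'AndroidManifest.xml', 'resources.arsc')  # from an APK
rgb = np.stack([channel(open(p, 'rb').read()) for p in parts], axis=-1)
# rgb is a (256, 256, 3) image: DEX -> R, Manifest -> G, Resources -> B
```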

19 pages, 4037 KB  
Article
A Rolling Bearing Fault Diagnosis Method Based on Wild Horse Optimizer-Enhanced VMD and Improved GoogLeNet
by Xiaoliang He, Feng Zhao, Nianyun Song, Zepeng Liu and Libing Cao
Sensors 2025, 25(14), 4421; https://doi.org/10.3390/s25144421 - 16 Jul 2025
Viewed by 532
Abstract
To address the challenges of weak fault features and strong non-stationarity in early-stage vibration signals, this study proposes a novel fault diagnosis method combining enhanced variational mode decomposition (VMD) with a structurally improved GoogLeNet. Specifically, an improved wild horse optimizer (IWHO) with tent chaotic mapping is employed to automatically optimize critical VMD parameters, including the number of modes K and the penalty factor α, enabling precise decomposition of non-stationary signals to extract weak fault features. The vibration signal is decomposed, and the top five intrinsic mode functions (IMFs) are selected based on the kurtosis criterion. Time–frequency features are then extracted from these IMFs and fed into a modified GoogLeNet classifier. The GoogLeNet structure is improved by replacing standard n × n convolution kernels with cascaded 1 × n and n × 1 kernels, and by substituting the ReLU activation function with a parameterized TReLU function to enhance adaptability and convergence. Experimental results on two public rolling bearing datasets demonstrate that the proposed method handles non-stationary signals effectively, achieving 99.17% accuracy across four fault types and maintaining over 95.80% accuracy under noisy conditions.
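The kernel substitution described above is standard convolution factorization, sketched here with illustrative channel counts: a 1 × n convolution followed by an n × 1 convolution covers the same n × n receptive field with 2n instead of n² weights per channel pair.

```python
import torch.nn as nn

n = 3
factorized = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=(1, n), padding=(0, n // 2)),
    nn.Conv2d(64, 64, kernel_size=(n, 1), padding=(n // 2, 0)),
)
# vs. nn.Conv2d(64, 64, kernel_size=n, padding=n // 2): fewer parameters,
# same output spatial size.
```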

37 pages, 6001 KB  
Article
Deep Learning-Based Crack Detection on Cultural Heritage Surfaces
by Wei-Che Huang, Yi-Shan Luo, Wen-Cheng Liu and Hong-Ming Liu
Appl. Sci. 2025, 15(14), 7898; https://doi.org/10.3390/app15147898 - 15 Jul 2025
Viewed by 1307
Abstract
This study employs a deep learning-based object detection model, GoogLeNet, to identify cracks in cultural heritage images. A semantic segmentation model, SegNet, is then used to determine the location and extent of the cracks. To establish a scale ratio between image pixels and real-world dimensions, a parallel laser-based measurement approach is applied, enabling precise crack length calculations. The results indicate that the percentage error between crack lengths estimated via deep learning and those measured with a caliper is approximately 3%, demonstrating the feasibility and reliability of the proposed method. The study also examines the impact of iteration count, image quantity, and image category on the performance of GoogLeNet and SegNet. While increasing the number of iterations significantly improves learning performance in the early stages, excessive iterations lead to overfitting: the optimal performance for GoogLeNet was achieved at 75 iterations, whereas SegNet reached its best performance after 45,000 iterations. Similarly, while expanding the training dataset enhances model generalization, an excessive number of images may also contribute to overfitting; GoogLeNet performed best with a training set of 66 images, while SegNet achieved the best segmentation accuracy when trained with 300 images. Furthermore, the study investigates the effect of crack image categories by classifying the datasets into four groups: general cracks, plain wall cracks, mottled wall cracks, and brick wall cracks. The findings reveal that training GoogLeNet and SegNet on general crack images yielded the best performance, whereas training on a single crack category substantially reduced generalization capability.
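The pixel-to-length conversion behind that 3% figure is simple proportionality, sketched here with made-up numbers: two parallel laser dots a known distance apart calibrate millimeters per pixel.

```python
laser_gap_mm = 50.0    # known physical spacing of the parallel lasers
laser_gap_px = 420.0   # measured dot spacing in the image
crack_px = 1310.0      # crack length traced on the SegNet mask
crack_mm = crack_px * laser_gap_mm / laser_gap_px   # about 156 mm
```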

19 pages, 3165 KB  
Article
Majority Voting Ensemble of Deep CNNs for Robust MRI-Based Brain Tumor Classification
by Kuo-Ying Liu, Nan-Han Lu, Yung-Hui Huang, Akari Matsushima, Koharu Kimura, Takahide Okamoto and Tai-Been Chen
Diagnostics 2025, 15(14), 1782; https://doi.org/10.3390/diagnostics15141782 - 15 Jul 2025
Viewed by 925
Abstract
Background/Objectives: Accurate classification of brain tumors is critical for treatment planning and prognosis. While deep convolutional neural networks (CNNs) have shown promise in medical imaging, few studies have systematically compared multiple architectures or integrated ensemble strategies to improve diagnostic performance. This study evaluated various CNN models and optimized classification performance using a majority voting ensemble on T1-weighted MRI brain images. Methods: Seven pretrained CNN architectures were fine-tuned to classify four categories: glioblastoma, meningioma, pituitary adenoma, and no tumor. Each model was trained with two optimizers (SGDM and ADAM) and evaluated on a public dataset split into training (70%), validation (10%), and testing (20%) subsets, then further validated on an independent external dataset to assess generalizability. A majority voting ensemble was constructed by aggregating predictions from all 14 trained models. Performance was assessed using accuracy, the Kappa coefficient, true positive rate, precision, confusion matrices, and ROC curves. Results: Among individual models, GoogLeNet and Inception-v3 with ADAM achieved the highest classification accuracy (0.987). However, the ensemble outperformed all standalone models, achieving an accuracy of 0.998, a Kappa coefficient of 0.997, and AUC values above 0.997 for all tumor classes, with improved sensitivity, precision, and overall robustness. Conclusions: The majority voting ensemble of diverse CNN architectures significantly enhanced MRI-based brain tumor classification, surpassing every single model. These findings underscore the value of model diversity and ensemble learning in building reliable AI-driven diagnostic tools for neuro-oncology.
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
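Majority voting itself is a one-liner over the 14 models' predictions, as in this sketch with a fabricated vote matrix (14 models × 32 images, 4 classes).

```python
import numpy as np

votes = np.random.default_rng(0).integers(0, 4, size=(14, 32))
ensemble = np.array([np.bincount(col).argmax() for col in votes.T])
# ensemble[i] is the class most models assigned to image i
```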

17 pages, 1820 KB  
Article
A Federated Learning Architecture for Bird Species Classification in Wetlands
by David Mulero-Pérez, Javier Rodriguez-Juan, Tamai Ramirez-Gordillo, Manuel Benavent-Lledo, Pablo Ruiz-Ponce, David Ortiz-Perez, Hugo Hernandez-Lopez, Anatoli Iarovikov, Jose Garcia-Rodriguez, Esther Sebastián-González, Olamide Jogunola, Segun I. Popoola and Bamidele Adebisi
J. Sens. Actuator Netw. 2025, 14(4), 71; https://doi.org/10.3390/jsan14040071 - 9 Jul 2025
Viewed by 1231
Abstract
Federated learning allows models to be trained on edge devices with local data, eliminating the need to share data with a central server. This significantly reduces the amount of data transferred from edge devices to central servers, which is particularly important in rural areas with limited bandwidth. Despite the potential of federated learning to fine-tune deep learning models on data collected by edge devices in low-resource environments, its application to bird monitoring remains underexplored. This study proposes a federated learning pipeline tailored to bird species classification in wetlands, based on lightweight convolutional neural networks optimized for resource-constrained devices. Since the performance of federated learning is strongly influenced by the models used and the experimental setting, this study conducts a comprehensive comparison of well-known lightweight models, including WideResNet, EfficientNetV2, MNASNet, GoogLeNet, and ResNet, in different training settings. The results demonstrate the importance of the training setting in federated learning architectures and the suitability of the different models for bird species recognition. This work contributes to the wider application of federated learning in ecological monitoring and highlights its potential to overcome challenges such as bandwidth limitations.
(This article belongs to the Special Issue Federated Learning: Applications and Future Directions)
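A hedged sketch of a standard aggregation step such a pipeline would use: federated averaging (FedAvg), with client weights averaged in proportion to local sample counts. The paper's exact aggregation rule is not given in the abstract, so treat this as an assumption.

```python
import torch

def fed_avg(state_dicts, n_samples):
    """Average client model weights, weighted by local dataset size."""
    total = float(sum(n_samples))
    return {
        key: sum(sd[key] * (n / total) for sd, n in zip(state_dicts, n_samples))
        for key in state_dicts[0]
    }
```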
