Search Results (81)

Search Parameters:
Keywords = MobileVit

22 pages, 17966 KiB  
Article
CTIFERK: A Thermal Infrared Facial Expression Recognition Model with Kolmogorov–Arnold Networks for Smart Classrooms
by Zhaoyu Shou, Yongsheng Tang, Dongxu Li, Jianwen Mo and Cheng Feng
Symmetry 2025, 17(6), 864; https://doi.org/10.3390/sym17060864 - 2 Jun 2025
Viewed by 52
Abstract
Accurate recognition of student emotions in smart classrooms is vital for understanding learning states. Visible light-based facial expression recognition is often affected by illumination changes, making thermal infrared imaging a promising alternative due to its robust temperature distribution symmetry. This paper proposes CTIFERK, a thermal infrared facial expression recognition model integrating Kolmogorov–Arnold Networks (KANs). By incorporating multiple KAN layers, CTIFERK enhances feature extraction and fitting capabilities. It also balances pooling layer information from the MobileViT backbone to preserve symmetrical facial features, improving recognition accuracy. Experiments on the Tufts Face Database, the IRIS Database, and the self-constructed GUET thermalface dataset show that CTIFERK achieves accuracies of 81.82%, 82.19%, and 65.22%, respectively, outperforming baseline models. These results validate CTIFERK’s effectiveness and superiority for thermal infrared expression recognition in smart classrooms, enabling reliable emotion monitoring.
(This article belongs to the Section Computer)
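
For readers unfamiliar with Kolmogorov–Arnold Networks, the sketch below shows the core idea of a KAN layer: every input–output edge carries its own learnable univariate function rather than a scalar weight. This toy PyTorch layer uses a Gaussian RBF basis (published KANs typically use B-splines), and the 320-dimensional feature input and 7 emotion classes are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class RBFKANLayer(nn.Module):
    """Toy KAN layer: each input-output edge applies a learnable univariate
    function, parameterized here as a sum of Gaussian radial basis functions
    (a simplification of the usual spline parameterization)."""
    def __init__(self, in_dim: int, out_dim: int, num_basis: int = 8):
        super().__init__()
        # Fixed RBF centers spread over the expected input range [-2, 2].
        self.register_buffer("centers", torch.linspace(-2.0, 2.0, num_basis))
        self.gamma = 2.0  # RBF width
        # One coefficient per (output, input, basis) triple.
        self.coef = nn.Parameter(torch.randn(out_dim, in_dim, num_basis) * 0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_dim) -> basis responses: (batch, in_dim, num_basis)
        phi = torch.exp(-self.gamma * (x.unsqueeze(-1) - self.centers) ** 2)
        # Evaluate each edge's univariate function, then sum over inputs.
        return torch.einsum("bik,oik->bo", phi, self.coef)

# A small classification head of the kind the abstract describes could stack
# two such layers after a pooled MobileViT feature vector (sizes hypothetical):
head = nn.Sequential(RBFKANLayer(320, 64), RBFKANLayer(64, 7))
logits = head(torch.randn(4, 320))  # 4 pooled feature vectors -> 7 emotions
```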

20 pages, 7658 KiB  
Article
Assessment Method for Feeding Intensity of Fish Schools Using MobileViT-CoordAtt
by Shikun Liu, Xingguo Liu and Haisheng Zou
Fishes 2025, 10(6), 253; https://doi.org/10.3390/fishes10060253 - 28 May 2025
Viewed by 69
Abstract
Assessment of fish feeding intensity is crucial for achieving precise feeding and for enhancing aquaculture efficiency. However, complex backgrounds in real-world aquaculture scenarios—such as water surface reflections, wave disturbances, and the stochastic movement of fish schools—pose significant challenges to the precise extraction of feeding-related features. To address this issue, this study proposes a fish feeding intensity assessment method based on MobileViT-CoordAtt. The method employs a lightweight MobileViT backbone network, integrated with a Coordinate Attention (CoordAtt) mechanism and a multi-scale feature fusion strategy. Specifically, the CoordAtt module enhances the model’s spatial perception by encoding spatial coordinate information, enabling precise capture of the spatial distribution characteristics of fish schools. The multi-scale feature fusion strategy adopts a three-level feature integration approach (input features, local features, and global features) to further strengthen the model’s representational capacity, ensuring robust extraction of key feeding-related features across diverse scales and hierarchical levels. Experimental results demonstrate that the MobileViT-CoordAtt model, trained with transfer learning, achieves an accuracy of 97.18% on the test set, with a compact parameter size of 4.09 MB. These findings indicate that the proposed method can effectively evaluate fish feeding intensity in practical aquaculture environments, providing critical support for formulating dynamic feeding strategies.
(This article belongs to the Special Issue Development and Innovations of Smart Aquaculture Technologies)
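
The Coordinate Attention mechanism the abstract integrates is a published module (Hou et al., 2021); a minimal PyTorch rendering of it follows. Attention is pooled along height and width separately, so the weights keep positional information. The feature shape in the usage line is an assumed example, not taken from the paper.

```python
import torch
import torch.nn as nn

class CoordAtt(nn.Module):
    """Coordinate Attention: factorize global pooling into two directional
    poolings so the channel attention retains H/W coordinate information."""
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # (B, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # (B, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        xh = self.pool_h(x)                      # (B, C, H, 1)
        xw = self.pool_w(x).permute(0, 1, 3, 2)  # (B, C, W, 1)
        y = self.act(self.bn(self.conv1(torch.cat([xh, xw], dim=2))))
        yh, yw = torch.split(y, [h, w], dim=2)
        ah = torch.sigmoid(self.conv_h(yh))                      # (B, C, H, 1)
        aw = torch.sigmoid(self.conv_w(yw.permute(0, 1, 3, 2)))  # (B, C, 1, W)
        return x * ah * aw                       # direction-aware reweighting

feats = torch.randn(2, 96, 32, 32)   # e.g. a MobileViT stage output (assumed)
print(CoordAtt(96)(feats).shape)     # torch.Size([2, 96, 32, 32])
```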

18 pages, 11374 KiB  
Article
A Novel Lightweight Algorithm for Sonar Image Recognition
by Gang Wan, Qi He, Qianqian Zhang, Hanren Wang, Huanru Sun, Xinnan Fan and Pengfei Shi
Sensors 2025, 25(11), 3329; https://doi.org/10.3390/s25113329 - 26 May 2025
Viewed by 118
Abstract
Sonar images possess characteristics such as low resolution, high noise, and blurred edges, so conventional CNNs suffer from inadequate target recognition accuracy. Moreover, due to their large size and high computational requirements, existing CNNs are difficult to deploy on embedded devices. Therefore, we propose a sonar image recognition algorithm built on the lightweight MobileViT and tailored to the features of sonar images. Firstly, the MobileViT block is modified by adding a redesigned skip-connection layer to capture more of the important features of sonar images. Secondly, the original 1 × 1 convolution in the MV2 module is replaced with a redesigned multi-scale Res2Net convolution to enhance the algorithm's ability to learn both global and local features. Finally, IB loss is applied to address the class imbalance in the sonar dataset, assigning different weights to samples to improve network performance. The experimental results show that each of the proposed improvements raises sonar image recognition accuracy to varying degrees. At the same time, the proposed algorithm is lightweight and can be deployed on embedded devices.
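
Res2Net-style multi-scale convolution, which the abstract substitutes for the MV2 module's 1 × 1 convolution, can be sketched as below: channels are split into groups whose outputs cascade into one another, mixing several receptive-field scales inside a single block. This is a generic rendering under assumed shapes, not the paper's exact module.

```python
import torch
import torch.nn as nn

class Res2NetConv(nn.Module):
    """Minimal Res2Net-style multi-scale convolution (Gao et al.): split the
    channels into `scales` groups; the first passes through, and each further
    group is convolved after adding the previous group's output, so deeper
    groups see progressively larger receptive fields."""
    def __init__(self, channels: int, scales: int = 4):
        super().__init__()
        assert channels % scales == 0
        self.scales = scales
        w = channels // scales
        self.convs = nn.ModuleList(
            [nn.Conv2d(w, w, 3, padding=1, bias=False) for _ in range(scales - 1)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        chunks = torch.chunk(x, self.scales, dim=1)
        out = [chunks[0]]                       # first split: identity
        y = None
        for conv, xi in zip(self.convs, chunks[1:]):
            y = conv(xi if y is None else xi + y)  # hierarchical reuse
            out.append(y)
        return torch.cat(out, dim=1)

x = torch.randn(1, 64, 40, 40)
print(Res2NetConv(64)(x).shape)  # torch.Size([1, 64, 40, 40])
```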

21 pages, 11638 KiB  
Article
YOLOv8-MFD: An Enhanced Detection Model for Pine Wilt Diseased Trees Using UAV Imagery
by Hua Shi, Yonghang Wang, Xiaozhou Feng, Yufen Xie, Zhenhui Zhu, Hui Guo and Guofeng Jin
Sensors 2025, 25(11), 3315; https://doi.org/10.3390/s25113315 - 24 May 2025
Viewed by 320
Abstract
Pine Wilt Disease (PWD) is a highly infectious and lethal disease that severely threatens global pine forest ecosystems and forestry economies. Early and accurate detection of infected trees is crucial to prevent large-scale outbreaks and support timely forest management. However, existing remote sensing-based detection models often struggle with performance degradation in complex environments, as well as a trade-off between detection accuracy and real-time efficiency. To address these challenges, we propose an improved object detection model, YOLOv8-MFD, designed for accurate and efficient detection of PWD-infected trees from UAV imagery. The model incorporates a MobileViT-based backbone that fuses convolutional neural networks with Transformer-based global modeling to enhance feature representation under complex forest backgrounds. To further improve robustness and precision, we integrate a Focal Modulation mechanism to suppress environmental interference and adopt a Dynamic Head to strengthen multi-scale object perception and adaptive feature fusion. Experimental results on a UAV-based forest dataset demonstrate that YOLOv8-MFD achieves a precision of 92.5%, a recall of 84.7%, an F1-score of 88.4%, and a mAP@0.5 of 88.2%. Compared to baseline models such as YOLOv8 and YOLOv10, our method achieves higher accuracy while maintaining acceptable computational cost (11.8 GFLOPs) and a compact model size (10.2 MB). Its inference speed is moderate and still suitable for real-time deployment. Overall, the proposed method offers a reliable solution for early-stage PWD monitoring across large forested areas, enabling more timely disease intervention and resource protection. Furthermore, its generalizable architecture holds promise for broader applications in forest health monitoring and agricultural disease detection.
(This article belongs to the Special Issue Sensor-Fusion-Based Deep Interpretable Networks)
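
The Focal Modulation mechanism the model adopts originates in FocalNets (Yang et al., 2022); a simplified sketch of the idea follows: hierarchical depthwise-conv contexts are gated, summed into a modulator, and multiplied with a query projection. The channel count and number of focal levels here are illustrative assumptions, not YOLOv8-MFD's exact configuration.

```python
import torch
import torch.nn as nn

class FocalModulation(nn.Module):
    """Simplified focal modulation: build depthwise-conv contexts with growing
    receptive fields, gate each level, add a gated global context, project the
    sum into a modulator, and multiply it with a query projection."""
    def __init__(self, dim: int, levels: int = 3):
        super().__init__()
        self.levels = levels
        self.proj_in = nn.Conv2d(dim, 2 * dim + levels + 1, 1)  # q, ctx, gates
        self.ctx_convs = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(dim, dim, 3 + 2 * l, padding=1 + l, groups=dim),
                nn.GELU(),
            )
            for l in range(levels)
        ])
        self.h = nn.Conv2d(dim, dim, 1)        # modulator projection
        self.proj_out = nn.Conv2d(dim, dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, z, gates = torch.split(
            self.proj_in(x), [x.size(1), x.size(1), self.levels + 1], dim=1
        )
        ctx = 0
        for l, conv in enumerate(self.ctx_convs):
            z = conv(z)                        # receptive field grows per level
            ctx = ctx + z * gates[:, l : l + 1]
        ctx = ctx + z.mean((2, 3), keepdim=True) * gates[:, -1:]  # global level
        return self.proj_out(q * self.h(ctx))

x = torch.randn(1, 128, 20, 20)
print(FocalModulation(128)(x).shape)  # torch.Size([1, 128, 20, 20])
```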

23 pages, 19606 KiB  
Article
Lubricating Grease Thickness Classification of Steel Wire Rope Surface Based on GEMR-MobileViT
by Ruqing Gong, Yuemin Wang, Fan Zhou and Binghui Tang
Sensors 2025, 25(9), 2738; https://doi.org/10.3390/s25092738 - 26 Apr 2025
Viewed by 259
Abstract
Proper surface lubrication with optimal grease thickness is essential for extending steel wire rope service life. To achieve automated lubrication quality control and address challenges like variable lighting and motion blur that degrade recognition accuracy in practical settings, this paper proposes an improved lightweight model, GEMR-MobileViT. The model is designed to identify the grease thickness on steel wire rope surfaces while avoiding the high parameter counts and computational complexity of existing models. In this model, part of the standard convolution is replaced by GhostConv, a novel efficient multi-scale attention (EMA) module is introduced into the local representation part of the MobileViT block, and the residual connections within the MobileViT block are redesigned. A transfer learning method is then employed. A custom dataset of steel wire rope lubrication images was constructed for model training. The experimental results demonstrated that GEMR-MobileViT achieved a recognition accuracy of 96.63% across five grease thickness categories, with 4.19 M params and 1.31 GFLOPs computational complexity. Compared with the original MobileViT, recognition accuracy improved by 4.4%, while parameters and computational complexity were reduced by 15.2% and 10.3%, respectively. When compared with current mainstream classification models such as ConvNeXtV2, EfficientNetV2, EdgeNeXt, NextViT, and MobileNetV4, GEMR-MobileViT achieved superior recognition accuracy and demonstrated significant advantages in model parameters, striking a good balance between recognition precision and model size. The proposed model facilitates deployment at steel wire rope lubrication working sites, enabling real-time monitoring of surface grease thickness and thereby offering a novel approach to automating steel wire rope maintenance.
(This article belongs to the Section Sensing and Imaging)
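
GhostConv, which replaces part of the standard convolution above, is the Ghost module from GhostNet (Han et al., 2020): half of the output channels come from a regular convolution and the other half from cheap depthwise operations. A minimal PyTorch version follows; the channel counts in the usage line are assumed for illustration.

```python
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Ghost module: a primary convolution produces half the output channels;
    cheap depthwise 3x3 ops generate the remaining "ghost" features, cutting
    parameters and FLOPs relative to a full convolution."""
    def __init__(self, in_ch: int, out_ch: int, kernel: int = 1):
        super().__init__()
        primary = out_ch // 2
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary, kernel, padding=kernel // 2, bias=False),
            nn.BatchNorm2d(primary),
            nn.ReLU(inplace=True),
        )
        self.cheap = nn.Sequential(  # depthwise "ghost" feature generator
            nn.Conv2d(primary, out_ch - primary, 3, padding=1,
                      groups=primary, bias=False),
            nn.BatchNorm2d(out_ch - primary),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

print(GhostConv(32, 64)(torch.randn(1, 32, 56, 56)).shape)  # (1, 64, 56, 56)
```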

25 pages, 4027 KiB  
Article
Edge-Optimized Deep Learning Architectures for Classification of Agricultural Insects with Mobile Deployment
by Muhammad Hannan Akhtar, Ibrahim Eksheir and Tamer Shanableh
Information 2025, 16(5), 348; https://doi.org/10.3390/info16050348 - 25 Apr 2025
Viewed by 476
Abstract
The deployment of machine learning models on mobile platforms has ushered in a new era of innovation across diverse sectors, including agriculture, where such applications hold immense promise for empowering farmers with cutting-edge technologies. In this context, the threat posed by insects to crop yields during harvest has escalated, fueled by factors such as evolution and climate change-induced shifts in insect behavior. To address this challenge, smart insect monitoring systems and detection models have emerged as crucial tools for farmers and IoT-based systems, enabling interventions to safeguard crops. The primary contribution of this study lies in its systematic investigation of model optimization techniques for edge deployment, including Post-Training Quantization, Quantization-Aware Training, and Data Representative Quantization. As such, we address the crucial need for efficient, on-site pest detection tools in agricultural settings. We provide a detailed analysis of the trade-offs between model size, inference speed, and accuracy across different optimization approaches, ensuring practical applicability in resource-constrained farming environments. Our study explores various methodologies for model development, including the utilization of Mobile-ViT and EfficientNet architectures, coupled with transfer learning and fine-tuning techniques. Using the Dangerous Farm Insects Dataset, we achieve an accuracy of 82.6% and 77.8% on validation and test datasets, respectively, showcasing the efficacy of our approach. Furthermore, we investigate quantization techniques to optimize model performance for on-device inference, ensuring seamless deployment on mobile devices and other edge devices without compromising accuracy. The best quantized model, produced through Post-Training Quantization, was able to maintain a classification accuracy of 77.8% while significantly reducing the model size from 33 MB to 9.6 MB. To validate the generalizability of our solution, we extended our experiments to the larger IP102 dataset. The quantized model produced using Post-Training Quantization was able to maintain a classification accuracy of 59.6% while also reducing the model size from 33 MB to 9.6 MB, thus demonstrating that our solution maintains a competitive performance across a broader range of insect classes.
(This article belongs to the Special Issue Intelligent Information Technology)
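
Post-Training Quantization with a representative dataset, the technique behind the 33 MB to 9.6 MB reduction reported above, can be performed with TensorFlow Lite roughly as follows. The model, class count, input size, and sample array are placeholders, not the study's actual artifacts.

```python
import numpy as np
import tensorflow as tf

# Stand-ins for a trained Keras classifier and a small calibration set
# (both hypothetical; substitute the real trained model and training images).
model = tf.keras.applications.MobileNetV2(weights=None, classes=15)
sample_images = np.random.rand(100, 224, 224, 3).astype(np.float32)

def representative_dataset():
    # A few hundred real samples let the converter calibrate activation ranges.
    for i in range(100):
        yield [sample_images[i : i + 1]]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]     # enable PTQ
converter.representative_dataset = representative_dataset
tflite_model = converter.convert()

with open("insects_quantized.tflite", "wb") as f:
    f.write(tflite_model)
print(f"quantized size: {len(tflite_model) / 1e6:.1f} MB")
```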

27 pages, 13928 KiB  
Article
Sea Surface Floating Small-Target Detection Based on Dual-Feature Images and Improved MobileViT
by Yang Liu, Hongyan Xing and Tianhao Hou
J. Mar. Sci. Eng. 2025, 13(3), 572; https://doi.org/10.3390/jmse13030572 - 14 Mar 2025
Viewed by 580
Abstract
Small-target detection in sea clutter is a key challenge in marine radar surveillance, crucial for maritime safety and target identification. This study addresses the challenge of weak feature representation in one-dimensional (1D) sea clutter time-series analysis and suboptimal detection performance for sea surface small targets. A novel dual-feature image detection method incorporating an improved mobile vision transformer (MobileViT) network is proposed to overcome these limitations. The method converts 1D sea clutter signals into two-dimensional (2D) fused images by means of a Gramian angular difference field (GADF) and recurrence plot (RP), enhancing the model’s key-information extraction. The improved MobileViT architecture enhances detection capabilities through multi-scale feature fusion with local–global information interaction, integration of coordinate attention (CA) for directional spatial feature enhancement, and replacement of ReLU6 with SiLU activation in MobileNetV2 (MV2) modules to boost nonlinear representation. Experimental results on the IPIX dataset demonstrate that dual-feature images outperform single-feature images in detection under a 10⁻³ constant false-alarm rate (FAR) condition. The improved MobileViT attains 98.6% detection accuracy across all polarization modes, significantly surpassing other advanced methods. This study provides a new paradigm for time-series radar signal analysis through image-based deep learning fusion.
(This article belongs to the Section Ocean Engineering)
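
The GADF and RP encodings used to turn 1D returns into 2D images follow standard formulas, sketched here in NumPy: the GADF maps a rescaled series to sin(φ_i − φ_j) of its angular encoding, and a recurrence plot thresholds pairwise sample distances. The signal length and ε threshold are assumptions for illustration.

```python
import numpy as np

def gadf(x: np.ndarray) -> np.ndarray:
    """Gramian angular difference field of a 1D series.
    Rescale to [-1, 1], encode as angles, then take pairwise sin differences."""
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1.0, 1.0))
    return np.sin(phi[:, None] - phi[None, :])

def recurrence_plot(x: np.ndarray, eps: float) -> np.ndarray:
    """Binary recurrence plot: 1 where two samples are within eps of each other."""
    return (np.abs(x[:, None] - x[None, :]) <= eps).astype(np.float32)

# Hypothetical radar range-cell return: one 1D amplitude series becomes two
# 2D "images" that can be stacked as input channels for the classifier.
signal = np.random.randn(128)
img = np.stack([gadf(signal), recurrence_plot(signal, eps=0.3)])
print(img.shape)  # (2, 128, 128)
```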

32 pages, 12235 KiB  
Article
Explainable MRI-Based Ensemble Learnable Architecture for Alzheimer’s Disease Detection
by Opeyemi Taiwo Adeniran, Blessing Ojeme, Temitope Ezekiel Ajibola, Ojonugwa Oluwafemi Ejiga Peter, Abiola Olayinka Ajala, Md Mahmudur Rahman and Fahmi Khalifa
Algorithms 2025, 18(3), 163; https://doi.org/10.3390/a18030163 - 13 Mar 2025
Viewed by 699
Abstract
With the advancements in deep learning methods, AI systems now perform at or above the level of human intelligence in many complex real-world problems. The data and algorithmic opacity of deep learning models, however, makes the task of comprehending the input data, the model, and the model’s decisions quite challenging. This lack of transparency constitutes both a practical and an ethical issue. For the present study, it is a major drawback to the deployment of deep learning methods tasked with detecting and prognosticating Alzheimer’s disease. Many approaches presented in the AI and medical literature for overcoming this critical weakness come at the cost of sacrificing accuracy for interpretability. This study attempts to address this challenge and to foster transparency and reliability in AI-driven healthcare solutions. It explores commonly used perturbation-based (LIME) and gradient-based (Saliency and Grad-CAM) interpretability approaches for visualizing and explaining the dataset, models, and decisions of MRI image-based Alzheimer’s disease identification, using the diagnostic and predictive strengths of an ensemble framework comprising Convolutional Neural Network (CNN) architectures (a custom multi-classifier CNN, VGG-19, ResNet, MobileNet, EfficientNet, and DenseNet) and a Vision Transformer (ViT). The experimental results show the stacking ensemble achieving a remarkable accuracy of 98.0%, while the hard voting ensemble reached 97.0%. The findings are a valuable contribution to the growing field of explainable artificial intelligence (XAI) in medical imaging, helping end users and researchers gain a deep understanding of the story behind medical image datasets and deep learning models’ decisions.
(This article belongs to the Special Issue Algorithms for Computer Aided Diagnosis: 2nd Edition)
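
The hard-voting ensemble reported above (97.0%) combines the member models' predictions by majority vote, which can be sketched as follows; the class and model counts in the example are arbitrary. (The stacking ensemble instead trains a meta-learner on the members' outputs.)

```python
import numpy as np

def hard_vote(prob_list: list[np.ndarray]) -> np.ndarray:
    """Majority vote over per-model class predictions.
    prob_list: one (n_samples, n_classes) probability array per model."""
    votes = np.stack([p.argmax(axis=1) for p in prob_list])  # (n_models, n)
    n_classes = prob_list[0].shape[1]
    counts = np.apply_along_axis(
        lambda v: np.bincount(v, minlength=n_classes), 0, votes
    )                                                        # (n_classes, n)
    return counts.argmax(axis=0)                             # winning class per sample

# Hypothetical softmax outputs of three ensemble members on 10 samples:
rng = np.random.default_rng(0)
probs = [rng.dirichlet(np.ones(4), size=10) for _ in range(3)]
print(hard_vote(probs))  # one predicted class per sample
```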

22 pages, 8390 KiB  
Article
Dual-Attention-Enhanced MobileViT Network: A Lightweight Model for Rice Disease Identification in Field-Captured Images
by Meng Zhang, Zichao Lin, Shuqi Tang, Chenjie Lin, Liping Zhang, Wei Dong and Nan Zhong
Agriculture 2025, 15(6), 571; https://doi.org/10.3390/agriculture15060571 - 7 Mar 2025
Cited by 1 | Viewed by 816
Abstract
Accurate identification of rice diseases is crucial for improving rice yield and ensuring food security. In this study, we constructed an image dataset containing six classes of rice diseases captured under real field conditions to address challenges such as complex backgrounds, varying lighting, and symptom similarities. Based on the MobileViT-XXS architecture, we proposed an enhanced model named MobileViT-DAP, which integrates Channel Attention (CA), Efficient Channel Attention (ECA), and PoolFormer blocks to achieve precise classification of rice diseases. The experimental results demonstrated that the improved model achieved superior performance with 0.75 M params and 0.23 GFLOPs, ensuring computational efficiency while maintaining high classification accuracy. On the testing set, the model achieved an accuracy of 99.61%, a precision of 99.64%, a recall of 99.59%, and a specificity of 99.92%. Compared to traditional lightweight models, MobileViT-DAP showed significant improvements in model complexity, computational efficiency, and classification performance, effectively balancing lightweight design with high accuracy. Furthermore, visualization analysis confirmed that the model’s decision-making process primarily relies on lesion-related features, enhancing its interpretability and reliability. This study provides a novel perspective for optimizing plant disease recognition tasks and contributes to improving plant protection strategies, offering a solution for accurate and efficient disease monitoring in agricultural applications.
(This article belongs to the Section Digital Agriculture)
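
Efficient Channel Attention, one of the three blocks added to MobileViT-XXS here, is a published module (Wang et al., 2020) that replaces the usual squeeze-and-excitation MLP with a tiny 1D convolution across channels. A minimal PyTorch version, with assumed tensor shapes:

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: global average pool, then a k-tap 1D
    convolution over the channel dimension, so the attention adds only k
    extra weights (k = kernel size)."""
    def __init__(self, k: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = x.mean((2, 3))                        # (B, C): global average pool
        y = self.conv(y.unsqueeze(1)).squeeze(1)  # (B, 1, C) -> (B, C)
        return x * torch.sigmoid(y)[:, :, None, None]

x = torch.randn(2, 64, 28, 28)
print(ECA()(x).shape)  # torch.Size([2, 64, 28, 28])
```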

29 pages, 9831 KiB  
Article
Quality of Experience (QoE) in Cloud Gaming: A Comparative Analysis of Deep Learning Techniques via Facial Emotions in a Virtual Reality Environment
by Awais Khan Jumani, Jinglun Shi, Asif Ali Laghari, Muhammad Ahmad Amin, Aftab ul Nabi, Kamlesh Narwani and Yi Zhang
Sensors 2025, 25(5), 1594; https://doi.org/10.3390/s25051594 - 5 Mar 2025
Viewed by 882
Abstract
Cloud gaming has rapidly transformed the gaming industry, allowing users to play games on demand from anywhere without the need for powerful hardware. Cloud service providers are striving to enhance user Quality of Experience (QoE) using traditional assessment methods. However, these traditional methods often fail to capture the actual user QoE because some users are not serious about providing feedback regarding cloud services. Additionally, some players, even after receiving services as per the Service Level Agreement (SLA), claim that they are not receiving services as promised. This poses a significant challenge for cloud service providers in accurately identifying QoE and improving actual services. In this paper, we evaluate our previously proposed technique, which uses a deep learning (DL) model to assess QoE through players’ facial expressions during cloud gaming sessions in a virtual reality (VR) environment. The EmotionNET model is based on a convolutional neural network (CNN) architecture. We then compare EmotionNET against three other DL techniques, namely ConvNeXt, EfficientNet, and Vision Transformer (ViT). We trained the EmotionNET, ConvNeXt, EfficientNet, and ViT models on our custom-developed dataset, achieving 98.9% training accuracy and 87.8% validation accuracy with the EmotionNET model. Based on the training and comparison results, it is evident that the EmotionNET model predicts and performs better than the other models. Finally, we compared the EmotionNET results on two network (WiFi and mobile data) datasets. Our findings indicate that facial expressions are strongly correlated with QoE.

24 pages, 16681 KiB  
Article
A Deep Ensemble Learning Approach Based on a Vision Transformer and Neural Network for Multi-Label Image Classification
by Anas W. Abulfaraj and Faisal Binzagr
Big Data Cogn. Comput. 2025, 9(2), 39; https://doi.org/10.3390/bdcc9020039 - 11 Feb 2025
Viewed by 1316
Abstract
Convolutional Neural Networks (CNNs) have proven to be very effective in image classification due to their status as a powerful feature learning algorithm. Traditional approaches have considered the problem of multiclass classification, where the goal is to classify a set of objects at once. However, co-occurrence can make the discriminative features of the target less salient and may lead to overfitting of the model, resulting in lower performance. To address this, we propose a multi-label classification ensemble model including a Vision Transformer (ViT) and CNNs for directly detecting one or multiple objects in an image. First, we improve the MobileNetV2 and DenseNet201 models using extra convolutional layers to strengthen image classification; in detail, three convolution layers are applied in parallel at the end of both models. ViT can learn dependencies among distant positions as well as local detail, making it an effective tool for multi-label classification. Finally, an ensemble learning algorithm combines the classification predictions of the ViT, the modified MobileNetV2, and the modified DenseNet201 branches through a voting system for increased image classification accuracy. The performance of the proposed model is examined on four benchmark datasets, achieving accuracies of 98.24%, 98.89%, 99.91%, and 96.69% on PASCAL VOC 2007, PASCAL VOC 2012, MS-COCO, and NUS-WIDE 318, respectively, showing that our framework can enhance current state-of-the-art methods.
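
A multi-label voting ensemble of the kind described can be sketched as below: each member produces per-label sigmoid probabilities (labels are decided independently rather than via a softmax over classes), and the ensemble averages and thresholds them. This soft-voting rendering, and the label and model counts, are illustrative assumptions rather than the paper's exact rule.

```python
import torch

def ensemble_multilabel(logits_list: list[torch.Tensor],
                        threshold: float = 0.5) -> torch.Tensor:
    """Average each model's per-label sigmoid probabilities and threshold,
    so an image can carry several labels at once."""
    probs = torch.stack([torch.sigmoid(l) for l in logits_list]).mean(0)
    return (probs > threshold).int()

# Hypothetical raw outputs of a ViT and two modified CNN branches
# for a batch of 2 images over 20 candidate labels:
outs = [torch.randn(2, 20) for _ in range(3)]
print(ensemble_multilabel(outs))  # 0/1 indicator per (image, label) pair
```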

26 pages, 3982 KiB  
Article
Building Better Deep Learning Models Through Dataset Fusion: A Case Study in Skin Cancer Classification with Hyperdatasets
by Panagiotis Georgiadis, Emmanouil V. Gkouvrikos, Eleni Vrochidou, Theofanis Kalampokas and George A. Papakostas
Diagnostics 2025, 15(3), 352; https://doi.org/10.3390/diagnostics15030352 - 3 Feb 2025
Cited by 2 | Viewed by 1926
Abstract
Background/Objectives: This work brings to light the importance of forming large, diverse training datasets and proposes an image dataset merging application, namely, the Data Merger App, to streamline the management and synthesis of large-scale datasets. The Data Merger can recognize common classes across various datasets and provides tools to combine and organize them in a well-structured and easily accessible way. Methods: A case study is then presented, leveraging four different Convolutional Neural Network (CNN) models, VGG16, ResNet50, MobileNetV3-small, and DenseNet-161, and a Vision Transformer (ViT), to benchmark their performance in classifying skin cancer images when trained on single datasets and on enhanced hyperdatasets generated by the Data Merger App. Results: Extended experimental results indicated that enhanced hyperdatasets are efficient and able to improve the accuracies of classification models, whether the models are trained from scratch or by using Transfer Learning. Moreover, the ViT model achieved higher classification accuracies than the CNNs on datasets with a limited number of classes, reporting 91.87% accuracy for 9 classes, as well as on enhanced hyperdatasets with larger numbers of classes, reporting 58% accuracy for 32 classes. Conclusions: In essence, this work demonstrates the great significance of data combination, as well as the utility of the developed prototype web application as a critical tool for researchers and data scientists, enabling them to easily handle complex datasets, combine datasets into larger, more diverse versions, further enhance the generalization ability of models, and improve the quality and impact of their work.
(This article belongs to the Special Issue Deep Learning in Medical and Biomedical Image Processing)
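
The core of such a dataset merger, recognizing common classes across datasets and pooling them, can be sketched in a few lines of Python for class-per-folder image datasets. The folder names are hypothetical, and this is only the idea behind the Data Merger App, not its code.

```python
from pathlib import Path
import shutil

def merge_datasets(roots: list[Path], out: Path) -> None:
    """Merge class-per-folder image datasets into one hyperdataset:
    folders with the same (case-insensitive) class name are pooled."""
    for root in roots:
        for class_dir in filter(Path.is_dir, root.iterdir()):
            target = out / class_dir.name.lower()
            target.mkdir(parents=True, exist_ok=True)
            for img in class_dir.glob("*.jpg"):
                # Prefix with the source dataset name to avoid collisions.
                shutil.copy(img, target / f"{root.name}_{img.name}")

# Example (hypothetical dataset directories):
# merge_datasets([Path("ham10000"), Path("isic2019")], Path("hyperdataset"))
```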

19 pages, 2144 KiB  
Article
PDNet by Partial Deep Convolution: A Better Lightweight Detector
by Wei Wang, Yuanze Meng, Han Li, Shun Li, Chenghong Zhang, Guanghui Zhang and Weimin Lei
Electronics 2025, 14(3), 591; https://doi.org/10.3390/electronics14030591 - 2 Feb 2025
Viewed by 569
Abstract
Model lightweighting is significant in edge computing and mobile devices. Current studies on fast network design mainly focus on model computation compression and speedup. Many methods aim to compress models by dealing with redundant feature maps; however, most of them preserve the feature maps with simple manipulations and do not effectively reduce redundancy. This paper proposes a new convolution module, PDConv (Partial Deep Convolution), which compresses redundant feature maps to reduce network complexity while increasing network width to maintain accuracy. PDConv outperforms traditional methods in handling redundant feature maps, particularly in deep networks. Its FLOPs are comparable to depthwise separable convolution but with higher accuracy. This paper further proposes PDBottleNeck and PDC2f (Partial Deep CSPDarknet53 to 2-Stage FPN) and builds the lightweight network PDNet for experimental validation on the PASCAL VOC dataset. Compared to the popular HorNet, our method achieves an improvement of more than 25% in FLOPs and 1.8% in mAP50:95 accuracy. On the COCO 2017 dataset, our large PDNet achieves a 0.5% improvement in mAP75 and lower FLOPs than the recent RepViT.
(This article belongs to the Section Artificial Intelligence)
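
The abstract describes PDConv only at a high level; the sketch below renders the general "partial convolution" idea it evokes (a dense convolution on a channel subset, with a cheap pass-through for the remaining, presumed-redundant maps, as in FasterNet's PConv). It is an analogy under assumed shapes, not PDConv's actual design.

```python
import torch
import torch.nn as nn

class PartialConv(nn.Module):
    """Partial convolution sketch: run a dense 3x3 conv on only a fraction of
    the channels and concatenate the untouched remainder, so redundant maps
    cost almost nothing."""
    def __init__(self, channels: int, ratio: float = 0.25):
        super().__init__()
        self.n_conv = int(channels * ratio)
        self.conv = nn.Conv2d(self.n_conv, self.n_conv, 3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        dense, passthrough = torch.split(
            x, [self.n_conv, x.size(1) - self.n_conv], dim=1
        )
        return torch.cat([self.conv(dense), passthrough], dim=1)

x = torch.randn(1, 64, 32, 32)
# Convolving 16 of 64 channels costs (16/64)^2 = 1/16 the FLOPs of a full conv.
print(PartialConv(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```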

27 pages, 5472 KiB  
Article
A Deep Learning Approach for the Classification of Fibroglandular Breast Density in Histology Images of Human Breast Tissue
by Hanieh Heydarlou, Leigh J. Hodson, Mohsen Dorraki, Theresa E. Hickey, Wayne D. Tilley, Eric Smith, Wendy V. Ingman and Ali Farajpour
Cancers 2025, 17(3), 449; https://doi.org/10.3390/cancers17030449 - 28 Jan 2025
Viewed by 901
Abstract
Background: To progress research into the biological mechanisms that link mammographic breast density to breast cancer risk, fibroglandular breast density can be used as a surrogate measure. This study aimed to develop a computational tool to classify fibroglandular breast density in hematoxylin and eosin (H&E)-stained breast tissue sections using deep learning approaches that would assist future mammographic density research. Methods: Four different architectural configurations of transferred MobileNet-v2 convolutional neural networks (CNNs) and four different models of vision transformers were developed and trained on a database of H&E-stained normal human breast tissue sections (965 tissue blocks from 93 patients) that had been manually classified into one of five fibroglandular density classes, with class 1 being very low fibroglandular density and class 5 being very high fibroglandular density. Results: The MobileNet-Arc 1 and ViT model 1 achieved the highest overall F1 scores of 0.93 and 0.94, respectively. Both models exhibited the lowest false positive rate and highest true positive rate in class 5, while the most challenging classification was class 3, where images from classes 2 and 4 were mistakenly classified as class 3. The area under the curves (AUCs) for all classes were higher than 0.98. Conclusions: Both the ViT and MobileNet models showed promising performance in the accurate classification of H&E-stained tissue sections across all five fibroglandular density classes, providing a rapid and easy-to-use computational tool for breast density analysis.
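
A transferred MobileNet-v2 of the kind trained here, an ImageNet-pretrained backbone with a 5-way density classifier head, can be set up in torchvision as follows. The freezing policy, input size, and batch size are illustrative assumptions; the study's four architectural configurations differ in detail.

```python
import torch
import torch.nn as nn
from torchvision import models

# Transferred MobileNet-v2 for the five fibroglandular density classes.
model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
for p in model.features.parameters():
    p.requires_grad = False                    # freeze the pretrained backbone
model.classifier[1] = nn.Linear(model.last_channel, 5)  # density classes 1..5

x = torch.randn(8, 3, 224, 224)  # a batch of H&E tile crops (assumed size)
print(model(x).shape)            # torch.Size([8, 5])
```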

19 pages, 6578 KiB  
Article
Deep Learning Tool Wear State Identification Method Based on Cutting Force Signal
by Shuhang Li, Meiqiu Li and Yingning Gao
Sensors 2025, 25(3), 662; https://doi.org/10.3390/s25030662 - 23 Jan 2025
Cited by 1 | Viewed by 797
Abstract
The objective of this study is to accurately, expeditiously, and efficiently identify the wear state of milling cutters. To this end, a state identification method is proposed that combines continuous wavelet transform with an improved MobileViT lightweight network. The methodology transforms the cutting force signal acquired during milling into a time–frequency image by continuous wavelet transform. This is followed by the introduction of a Contextual Transformer module after layer 1 and the embedding of a Global Attention Mechanism module after layer 2 of the MobileViT network structure. These modifications are intended to enhance visual representation capability, reduce information loss, and improve the interaction between global features, improving the overall performance of the model. The improved MobileViT network was shown to enhance accuracy, precision, recall, and F1 score by 1.58%, 1.23%, 1.92%, and 1.57%, respectively, in comparison with the original MobileViT. The experimental results demonstrate that the proposed model exhibits a substantial advantage in memory occupation and prediction accuracy in comparison to models such as VGG16, ResNet18, and PoolFormer. This study thus provides an efficient method for identifying milling cutter wear state in near real time, with potential applications in industrial production.
(This article belongs to the Section Fault Diagnosis & Sensors)
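
Converting a cutting-force signal into the time–frequency image that feeds the network can be done with PyWavelets' continuous wavelet transform, as sketched below. The sampling rate, synthetic signal, wavelet choice ('morl'), and scale range are assumptions for illustration.

```python
import numpy as np
import pywt

# Hypothetical cutting-force channel sampled at 10 kHz; each window becomes
# a time-frequency image for the MobileViT classifier.
fs = 10_000
t = np.arange(0, 0.1, 1 / fs)
force = np.sin(2 * np.pi * 500 * t) + 0.3 * np.random.randn(t.size)

scales = np.arange(1, 65)                       # 64 rows in the scalogram
coeffs, freqs = pywt.cwt(force, scales, "morl", sampling_period=1 / fs)
image = np.abs(coeffs)                          # (64, len(t)) time-frequency map
print(image.shape, freqs[[0, -1]])              # highest/lowest analyzed freq.
```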
