Search Results (543)

Search Parameters:
Keywords = fine-grained classification

24 pages, 2394 KB  
Article
Extracting Emotions from Customer Reviews Using Text Mining, Large Language Models and Fine-Tuning Strategies
by Simona-Vasilica Oprea and Adela Bâra
J. Theor. Appl. Electron. Commer. Res. 2025, 20(3), 221; https://doi.org/10.3390/jtaer20030221 - 1 Sep 2025
Abstract
User-generated content, such as product and app reviews, offers more than just sentiment. It provides a rich spectrum of emotional expression that reveals users’ experiences, frustrations and expectations. Traditional sentiment analysis, which typically classifies text as positive or negative, lacks the nuance needed to fully understand the emotional drivers behind customer feedback. In this research, we focus on fine-grained emotion classification using core emotions. By identifying specific emotions rather than sentiment polarity, we enable more actionable insights for e-commerce and app development, supporting strategies such as feature refinement, marketing personalization and proactive customer engagement. We leverage the Hugging Face Emotions dataset and adopt a two-phase modeling approach. In the first phase, we use a pre-trained DistilBERT model as a feature extractor and evaluate multiple classical classifiers (Logistic Regression, Support Vector Classifier, Random Forest) to establish performance baselines. In the second phase, we fine-tune the DistilBERT model end-to-end using the Hugging Face Trainer API, optimizing classification performance through task-specific adaptation. Training is tracked using the Weights & Biases (wandb) API. Comparative analysis highlights the substantial performance gains from fine-tuning, particularly in capturing informal or noisy language typical in user reviews. The final fine-tuned model is applied to a dataset of customer reviews, identifying the dominant emotions expressed. Our results demonstrate the practical value of emotion-aware analytics in uncovering the underlying “why” behind user sentiment, enabling more empathetic decision-making across product design, customer support and user experience (UX) strategy.
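When the abstract speaks of fine-tuning DistilBERT end-to-end for emotion classification, the objective being minimized is a softmax cross-entropy over the emotion labels. A minimal pure-Python sketch of that objective (the six-class label set is an assumption based on the public Hugging Face Emotions dataset; the actual training loop lives inside the Trainer API):

```python
import math

def softmax(logits):
    # Numerically stable softmax: subtract the max logit before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    # Negative log-probability assigned to the true emotion class.
    probs = softmax(logits)
    return -math.log(probs[target_index])

# With uniform logits over 6 emotion classes, the loss equals ln(6) ~= 1.79;
# confident logits for the correct class drive it toward zero.
uniform_loss = cross_entropy([0.0] * 6, 2)
```

Fine-tuning simply backpropagates this loss through both the classification head and the transformer body, which is what separates phase two from the frozen-feature baselines of phase one.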

28 pages, 1705 KB  
Article
Identifying Literary Microgenres and Writing Style Differences in Romanian Novels with ReaderBench and Large Language Models
by Aura Cristina Udrea, Stefan Ruseti, Vlad Pojoga, Stefan Baghiu, Andrei Terian and Mihai Dascalu
Future Internet 2025, 17(9), 397; https://doi.org/10.3390/fi17090397 - 30 Aug 2025
Viewed by 45
Abstract
Recent developments in natural language processing, particularly large language models (LLMs), create new opportunities for literary analysis in underexplored languages like Romanian. This study investigates stylistic heterogeneity and genre blending in 175 late 19th- and early 20th-century Romanian novels, each classified by literary historians into one of 17 genres. Our findings reveal that most novels do not adhere to a single genre label but instead combine elements of multiple (micro)genres, challenging traditional single-label classification approaches. We employed a dual computational methodology, combining analysis based on Romanian-tailored linguistic features with general-purpose LLMs. ReaderBench, a Romanian-specific framework, was utilized to extract surface, syntactic, semantic, and discourse features, capturing fine-grained linguistic patterns. In parallel, we prompted two LLMs (Llama3.3 70B and DeepSeek-R1 70B) to predict genres at the paragraph level, leveraging their ability to detect contextual and thematic coherence across multiple narrative scales. Statistical analyses using Kruskal–Wallis and Mann–Whitney tests identified genre-defining features at both novel and chapter levels. The integration of these complementary approaches enhances microgenre detection beyond traditional classification capabilities. ReaderBench provides quantifiable linguistic evidence, while LLMs capture broader contextual patterns; together, they provide a multi-layered perspective on literary genre that reflects the complex and heterogeneous character of fictional texts. Our results argue that both language-specific and general-purpose computational tools can effectively detect stylistic diversity in Romanian fiction, opening new avenues for computational literary analysis in low-resource languages.
(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))
24 pages, 2357 KB  
Article
From Vision-Only to Vision + Language: A Multimodal Framework for Few-Shot Unsound Wheat Grain Classification
by Yuan Ning, Pengtao Lv, Qinghui Zhang, Le Xiao and Caihong Wang
AI 2025, 6(9), 207; https://doi.org/10.3390/ai6090207 - 29 Aug 2025
Viewed by 161
Abstract
Precise classification of unsound wheat grains is essential for crop yields and food security, yet most existing approaches rely on vision-only models that demand large labeled datasets, which is often impractical in real-world, data-scarce settings. To address this few-shot challenge, we propose UWGC, a novel vision-language framework designed for few-shot classification of unsound wheat grains. UWGC integrates two core modules: a fine-tuning module based on Adaptive Prior Refinement (APE) and a text prompt enhancement module that incorporates Advancing Textual Prompt (ATPrompt) and the multimodal model Qwen2.5-VL. The synergy between the two modules, leveraging cross-modal semantics, enhances the generalization of UWGC in low-data regimes. UWGC is offered in two variants, UWGC-F and UWGC-T, to accommodate different practical needs. Across few-shot settings on a public grain dataset, UWGC-F and UWGC-T consistently outperform existing vision-only and vision-language methods, highlighting their potential for unsound wheat grain classification in real-world agriculture.

28 pages, 4317 KB  
Article
Multi-Scale Attention Networks with Feature Refinement for Medical Item Classification in Intelligent Healthcare Systems
by Waqar Riaz, Asif Ullah and Jiancheng (Charles) Ji
Sensors 2025, 25(17), 5305; https://doi.org/10.3390/s25175305 - 26 Aug 2025
Viewed by 391
Abstract
The increasing adoption of artificial intelligence (AI) in intelligent healthcare systems has elevated the demand for robust medical imaging and vision-based inventory solutions. For an intelligent healthcare inventory system, accurate recognition and classification of medical items, including medicines and emergency supplies, are crucial for ensuring inventory integrity and timely access to life-saving resources. This study presents a hybrid deep learning framework, EfficientDet-BiFormer-ResNet, that integrates three specialized components: EfficientDet’s Bidirectional Feature Pyramid Network (BiFPN) for scalable multi-scale object detection, BiFormer’s bi-level routing attention for context-aware spatial refinement, and ResNet-18 enhanced with triplet loss and Online Hard Negative Mining (OHNM) for fine-grained classification. The model was trained and validated on a custom healthcare inventory dataset comprising over 5000 images collected under diverse lighting, occlusion, and arrangement conditions. Quantitative evaluations demonstrated that the proposed system achieved a mean average precision (mAP@0.5:0.95) of 83.2% and a top-1 classification accuracy of 94.7%, outperforming conventional models such as YOLO, SSD, and Mask R-CNN. The framework excelled in recognizing visually similar, occluded, and small-scale medical items. This work advances real-time medical item detection in healthcare by providing an AI-enabled, clinically relevant vision system for medical inventory management.
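The triplet loss with Online Hard Negative Mining used in the fine-grained classification branch can be sketched generically as follows (pure Python; the 2-D embeddings, margin, and Euclidean metric are illustrative assumptions, not the paper's exact configuration):

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss_ohnm(anchor, positive, negatives, margin=0.2):
    """Triplet loss with online hard negative mining:
    pick the negative closest to the anchor (the 'hardest' one),
    then apply the standard hinge  max(0, d(a,p) - d(a,n) + margin)."""
    d_ap = euclidean(anchor, positive)
    d_an = min(euclidean(anchor, n) for n in negatives)  # hardest negative
    return max(0.0, d_ap - d_an + margin)

# The hardest negative lies only 0.5 from the anchor, so the hinge is active:
loss = triplet_loss_ohnm([0.0, 0.0], [1.0, 0.0],
                         [[3.0, 0.0], [0.5, 0.0]], margin=0.2)
```

Mining the hardest negative inside each batch is what forces the embedding space to separate visually similar medical items rather than only the easy cases.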
(This article belongs to the Section Intelligent Sensors)

24 pages, 7604 KB  
Article
Ginseng-YOLO: Integrating Local Attention, Efficient Downsampling, and Slide Loss for Robust Ginseng Grading
by Yue Yu, Dongming Li, Shaozhong Song, Haohai You, Lijuan Zhang and Jian Li
Horticulturae 2025, 11(9), 1010; https://doi.org/10.3390/horticulturae11091010 - 25 Aug 2025
Viewed by 365
Abstract
Understory-cultivated Panax ginseng possesses high pharmacological and economic value; however, its visual quality grading predominantly relies on subjective manual assessment, constraining industrial scalability. To address challenges including fine-grained morphological variations, boundary ambiguity, and complex natural backgrounds, this study proposes Ginseng-YOLO, a lightweight and deployment-friendly object detection model for automated ginseng grade classification. The model is built on the YOLOv11n (You Only Look Once 11n) framework and integrates three complementary components: (1) C2-LWA, a cross-stage local window attention module that enhances discrimination of key visual features, such as primary root contours and fibrous textures; (2) ADown, a non-parametric downsampling mechanism that substitutes convolution operations with parallel pooling, markedly reducing computational complexity; and (3) Slide Loss, a piecewise IoU-weighted loss function designed to emphasize learning from samples with ambiguous or irregular boundaries. Experimental results on a curated multi-grade ginseng dataset indicate that Ginseng-YOLO achieves a Precision of 84.9%, a Recall of 83.9%, and an mAP@50 of 88.7%, outperforming YOLOv11n and other state-of-the-art variants. The model maintains a compact footprint, with 2.0 M parameters, 5.3 GFLOPs, and 4.6 MB model size, supporting real-time deployment on edge devices. Ablation studies further confirm the synergistic contributions of the proposed modules in enhancing feature representation, architectural efficiency, and training robustness. Successful deployment on the NVIDIA Jetson Nano demonstrates practical real-time inference capability under limited computational resources. This work provides a scalable approach for intelligent grading of forest-grown ginseng and offers methodological insights for the design of lightweight models in medicinal plants and agricultural applications.
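A published formulation of the Slide Loss weighting (from the YOLO-FaceV2 line of work, which this kind of module typically follows; the exact thresholds here are an assumption, not taken from this paper) boosts samples whose IoU falls just below the batch-mean IoU mu:

```python
import math

def slide_weight(iou, mu):
    """Piecewise sample weight of the Slide Loss family.
    mu is the mean IoU over samples; boundary samples near mu are boosted."""
    if iou <= mu - 0.1:
        return 1.0                    # easy negatives: unit weight
    if iou < mu:
        return math.exp(1.0 - mu)     # hard samples just below the mean: boosted
    return math.exp(1.0 - iou)        # confident samples: smoothly decaying weight

# With mu = 0.5, a sample at IoU 0.45 is up-weighted by e^0.5 ~= 1.65.
w = slide_weight(0.45, 0.5)
```

This weight multiplies the per-sample classification loss, which is how the model is pushed to learn from grade boundaries that are ambiguous rather than clear-cut.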
(This article belongs to the Section Medicinals, Herbs, and Specialty Crops)

26 pages, 30652 KB  
Article
Hybrid ViT-RetinaNet with Explainable Ensemble Learning for Fine-Grained Vehicle Damage Classification
by Ananya Saha, Mahir Afser Pavel, Md Fahim Shahoriar Titu, Afifa Zain Apurba and Riasat Khan
Vehicles 2025, 7(3), 89; https://doi.org/10.3390/vehicles7030089 - 25 Aug 2025
Viewed by 330
Abstract
Efficient and explainable vehicle damage inspection is essential due to the increasing complexity and volume of vehicular incidents. Traditional manual inspection approaches are not time-effective, prone to human error, and lead to inefficiencies in insurance claims and repair workflows. Existing deep learning methods, such as CNNs, often struggle with generalization, require large annotated datasets, and lack interpretability. This study presents a robust and interpretable deep learning framework for vehicle damage classification, integrating Vision Transformers (ViTs) and ensemble detection strategies. The proposed architecture employs a RetinaNet backbone with a ViT-enhanced detection head, implemented in PyTorch using the Detectron2 object detection technique. It is pretrained on COCO weights and fine-tuned through focal loss and aggressive augmentation techniques to improve generalization under real-world damage variability. The proposed system applies the Weighted Box Fusion (WBF) ensemble strategy to refine detection outputs from multiple models, offering improved spatial precision. To ensure interpretability and transparency, we adopt three explainability techniques (Grad-CAM, Grad-CAM++, and SHAP), offering semantic and visual insights into model decisions. A custom vehicle damage dataset with 4500 images has been built, consisting of approximately 60% curated images collected through targeted web scraping and crawling covering various damage types (such as bumper dents, panel scratches, and frontal impacts), along with 40% COCO dataset images to support model generalization. Comparative evaluations show that Hybrid ViT-RetinaNet achieves superior performance with an F1-score of 84.6%, mAP of 87.2%, and 22 FPS inference speed. In an ablation analysis, WBF, augmentation, transfer learning, and focal loss significantly improve performance, with focal loss increasing F1 by 6.3% for underrepresented classes and COCO pretraining boosting mAP by 8.7%. Additional architectural comparisons demonstrate that our full hybrid configuration not only maintains competitive accuracy but also achieves up to 150 FPS, making it well suited for real-time use cases. Robustness tests under challenging conditions, including real-world visual disturbances (smoke, fire, motion blur, varying lighting, and occlusions) and artificial noise (Gaussian and salt-and-pepper), confirm the model’s generalization ability. This work contributes a scalable, explainable, and high-performance solution for real-world vehicle damage diagnostics.
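The core averaging step of Weighted Box Fusion can be sketched in a few lines (a deliberate simplification: full WBF also clusters boxes by IoU across models and rescales confidences by the number of contributing detectors; the box coordinates and scores below are made up):

```python
def weighted_box_fusion(boxes, scores):
    """Fuse overlapping boxes from several models into one box whose
    [x1, y1, x2, y2] coordinates are the confidence-weighted average."""
    total = sum(scores)
    fused = [sum(s * b[i] for b, s in zip(boxes, scores)) / total
             for i in range(4)]
    conf = total / len(scores)  # simple average confidence of the cluster
    return fused, conf

# A high-confidence box dominates the fused coordinates:
fused, conf = weighted_box_fusion(
    [[0.0, 0.0, 10.0, 10.0], [2.0, 2.0, 12.0, 12.0]], [0.9, 0.1])
```

Unlike non-maximum suppression, which discards all but one box, this averaging keeps spatial evidence from every model, which is where the improved localization precision comes from.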

27 pages, 8196 KB  
Article
Enhancing Electric Vehicle Charging Infrastructure Planning with Pre-Trained Language Models and Spatial Analysis: Insights from Beijing User Reviews
by Yanxin Hou, Peipei Wang, Zhuozhuang Yao, Xinqi Zheng and Ziying Chen
ISPRS Int. J. Geo-Inf. 2025, 14(9), 325; https://doi.org/10.3390/ijgi14090325 - 24 Aug 2025
Viewed by 310
Abstract
With the growing adoption of electric vehicles, optimizing the user experience of charging infrastructure has become critical. However, extracting actionable insights from the vast number of user reviews remains a significant challenge, impeding demand-driven operational planning for charging stations and degrading the user experience. This study leverages three pre-trained language models to perform sentiment classification and multi-level topic identification on 168,129 user reviews from Beijing, facilitating a comprehensive understanding of user feedback. The experimental results reveal significant task-model specialization: RoBERTa-WWM excels in sentiment analysis (accuracy = 0.917) and fine-grained topic identification (Micro-F1 = 0.844), making it ideal for deep semantic extraction. Conversely, ELECTRA, after sufficient training, demonstrates a strong aptitude for coarse-grained topic summarization, highlighting its strength in high-level semantic generalization. Notably, the models offer capabilities beyond simple classification, including autonomous label normalization and the extraction of valuable information from comments with low information density. Furthermore, integrating textual and spatial analyses revealed striking patterns. We identified an urban–rural emotional gap (suburban users are more satisfied despite fewer facilities) and used geographically weighted regression (GWR) to quantify the spatial differences in the factors affecting user satisfaction in Beijing’s districts. We identified three types of areas requiring differentiated strategies, as follows: the northwestern region is highly sensitive to equipment quality, the central urban area has a complex relationship between supporting facilities and satisfaction, and the emerging adoption area is more sensitive to accessibility and price factors. These findings offer a data-driven framework for charging infrastructure planning, enabling operators to base decisions on real-world user feedback and tailor solutions to specific local contexts.

17 pages, 3907 KB  
Article
Motion Intention Prediction for Lumbar Exoskeletons Based on Attention-Enhanced sEMG Inference
by Mingming Wang, Linsen Xu, Zhihuan Wang, Qi Zhu and Tao Wu
Biomimetics 2025, 10(9), 556; https://doi.org/10.3390/biomimetics10090556 - 22 Aug 2025
Viewed by 346
Abstract
Exoskeleton robots function as augmentation systems that establish mechanical couplings with the human body, substantially enhancing the wearer’s biomechanical capabilities through assistive torques. We introduce a lumbar spine-assisted exoskeleton design based on Variable-Stiffness Pneumatic Artificial Muscles (VSPAM) and develop a dynamic adaptation mechanism bridging the pneumatic drive module with human kinematic intent to facilitate human–robot cooperative control. For kinematic intent resolution, we propose a multimodal fusion architecture integrating the VGG16 convolutional network with Long Short-Term Memory (LSTM) networks. By incorporating self-attention mechanisms, we construct a fine-grained relational inference module that leverages multi-head attention weight matrices to capture global spatio-temporal feature dependencies, overcoming local feature constraints inherent in traditional algorithms. We further employ cross-attention mechanisms to achieve deep fusion of visual and kinematic features, establishing aligned intermodal correspondence to mitigate unimodal perception limitations. Experimental validation demonstrates 96.1% ± 1.2% motion classification accuracy, offering a novel technical solution for rehabilitation robotics and industrial assistance. Full article
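The self-attention and cross-attention modules described here are both built from the same scaled dot-product primitive, which for a single query vector reduces to a few lines (a generic textbook formulation with toy vectors, not the paper's implementation):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def scaled_dot_product_attention(query, keys, values):
    """Single-query attention: weights = softmax(q . k / sqrt(d)),
    output = attention-weighted sum of the value vectors."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    out = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return out, weights

# A query aligned with the first key attends mostly to the first value:
out, w = scaled_dot_product_attention(
    [1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[10.0, 0.0], [0.0, 10.0]])
```

In cross-attention the queries come from one modality (e.g. sEMG features) and the keys/values from another (e.g. visual features), which is how the two streams are aligned before fusion.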
(This article belongs to the Special Issue Advanced Service Robots: Exoskeleton Robots 2025)

31 pages, 8900 KB  
Article
Attention-Fused Staged DWT-LSTM for Fault Diagnosis of Embedded Sensors in Asphalt Pavement
by Jiarui Zhang, Haihui Duan, Songtao Lv, Dongdong Ge and Chaoyue Rao
Materials 2025, 18(16), 3917; https://doi.org/10.3390/ma18163917 - 21 Aug 2025
Viewed by 402
Abstract
Fault diagnosis for embedded sensors in asphalt pavement faces significant challenges, including the scarcity of real-world fault data and the difficulty in identifying compound faults, which severely compromises the reliability of monitoring data. To address these issues, this study proposes an intelligent diagnostic framework that integrates a Discrete Wavelet Transform (DWT) with a staged, attention-based Long Short-Term Memory (LSTM) network. First, various fault modes were systematically defined, including short-term (i.e., bias, gain, and detachment), long-term (i.e., drift), and their compound forms. A fine-grained fault injection and labeling strategy was then developed to generate a comprehensive dataset. Second, a novel diagnostic model was designed based on a “Decomposition-Focus-Fusion” architecture. In this architecture, the DWT is employed to extract multi-scale features, and independent sub-models—a Bidirectional LSTM (Bi-LSTM) and a stacked LSTM—are subsequently utilized to specialize in learning short-term and long-term fault characteristics, respectively. Finally, an attention network intelligently weights and fuses the outputs from these sub-models to achieve precise classification of eight distinct sensor operational states. Validated through rigorous 5-fold cross-validation, experimental results demonstrate that the proposed framework achieves a mean diagnostic accuracy of 98.89% (±0.0040) on the comprehensive test set, significantly outperforming baseline models such as SVM, KNN, and a unified LSTM. A comprehensive ablation study confirmed that each component of the “Decomposition-Focus-Fusion” architecture—DWT features, staged training, and the attention mechanism—makes an indispensable contribution to the model’s superior performance. The model successfully distinguishes between “drift” and “normal” states—which severely confuse the baseline models—and accurately identifies various complex compound faults. Furthermore, simulated online diagnostic tests confirmed the framework’s rapid response capability to dynamic faults and its computational efficiency, meeting the demands of real-time monitoring. This study offers a precise and robust solution for the fault diagnosis of embedded sensors in asphalt pavement.
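The DWT front end of the "Decomposition" stage can be illustrated with one level of the Haar transform, the simplest wavelet (the abstract does not state which mother wavelet the authors use, so Haar here is an assumption for illustration):

```python
import math

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform: pairwise sums give
    the approximation (low-pass) band, pairwise differences the detail
    (high-pass) band, each scaled by 1/sqrt(2) to preserve signal energy."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

approx, detail = haar_dwt([4.0, 6.0, 10.0, 12.0])
```

The approximation band carries the slow trends where drift-type faults live, while the detail band carries the abrupt changes characteristic of bias, gain, and detachment faults, which is exactly the split the staged sub-models exploit.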

22 pages, 8764 KB  
Article
Multi-Class Classification of Breast Cancer Subtypes Using ResNet Architectures on Histopathological Images
by Akshat Desai and Rakeshkumar Mahto
J. Imaging 2025, 11(8), 284; https://doi.org/10.3390/jimaging11080284 - 21 Aug 2025
Viewed by 425
Abstract
Breast cancer is a significant cause of cancer-related mortality among women around the globe, underscoring the need for early and accurate diagnosis. Typically, histopathological analysis of biopsy slides is utilized for tumor classification. However, it is labor-intensive, subjective, and often affected by inter-observer variability. Therefore, this study explores a deep learning-based, multi-class classification framework for distinguishing breast cancer subtypes using convolutional neural networks (CNNs). Unlike previous work using the popular BreaKHis dataset, where binary classification models were applied, in this work, we differentiate eight histopathological subtypes: four benign (adenosis, fibroadenoma, phyllodes tumor, and tubular adenoma) and four malignant (ductal carcinoma, lobular carcinoma, mucinous carcinoma, and papillary carcinoma). This work leverages transfer learning with ImageNet-pretrained ResNet architectures (ResNet-18, ResNet-34, and ResNet-50) and extensive data augmentation to enhance classification accuracy and robustness across magnifications. Among the ResNet models, ResNet-50 achieved the best performance, attaining a maximum accuracy of 92.42%, an AUC-ROC of 99.86%, and an average specificity of 98.61%. These findings validate the combined effectiveness of CNNs and transfer learning in capturing fine-grained histopathological features required for accurate breast cancer subtype classification.
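The reported average specificity is a per-class quantity macro-averaged over the eight subtypes. A minimal sketch of that metric from a confusion matrix (generic metric code with a toy two-class matrix, not tied to this paper's evaluation scripts):

```python
def macro_specificity(cm):
    """Macro-averaged specificity from a confusion matrix where
    cm[i][j] = count of samples with true class i predicted as class j.
    Specificity of class c = TN / (TN + FP)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    specs = []
    for c in range(n):
        fp = sum(cm[i][c] for i in range(n) if i != c)  # predicted c, truly not c
        actual_c = sum(cm[c])                           # all samples truly of class c
        tn = total - actual_c - fp                      # neither truly c nor predicted c
        specs.append(tn / (tn + fp))
    return sum(specs) / n

# Toy 2-class example: specificities 0.9 and 0.8 average to 0.85.
spec = macro_specificity([[8, 2], [1, 9]])
```

High macro specificity matters here because it bounds the rate at which each subtype falsely absorbs samples of the other seven, which accuracy alone does not reveal.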
(This article belongs to the Special Issue AI-Driven Advances in Computational Pathology)

20 pages, 2115 KB  
Article
GAH-TNet: A Graph Attention-Based Hierarchical Temporal Network for EEG Motor Imagery Decoding
by Qiulei Han, Yan Sun, Hongbiao Ye, Ze Song, Jian Zhao, Lijuan Shi and Zhejun Kuang
Brain Sci. 2025, 15(8), 883; https://doi.org/10.3390/brainsci15080883 - 19 Aug 2025
Viewed by 390
Abstract
Background: Brain–computer interfaces (BCIs) based on motor imagery (MI) offer promising solutions for motor rehabilitation and communication. However, electroencephalography (EEG) signals are often characterized by low signal-to-noise ratios, strong non-stationarity, and significant inter-subject variability, which pose significant challenges for accurate decoding. Existing methods often struggle to simultaneously model the spatial interactions between EEG channels, the local fine-grained features within signals, and global semantic patterns. Methods: To address this, we propose the graph attention-based hierarchical temporal network (GAH-TNet), which integrates spatial graph attention modeling with hierarchical temporal feature encoding. Specifically, we design the graph attention temporal encoding block (GATE). The graph attention mechanism is used to model spatial dependencies between EEG channels and encode short-term temporal dynamic features. Subsequently, a hierarchical attention-guided deep temporal feature encoding block (HADTE) is introduced, which extracts local fine-grained and global long-term dependency features through two-stage attention and temporal convolution. Finally, a fully connected classifier is used to obtain the classification results. The proposed model is evaluated on two publicly available MI-EEG datasets. Results: Our method outperforms multiple existing state-of-the-art methods in classification accuracy. On the BCI IV 2a dataset, the average classification accuracy reaches 86.84%, and on BCI IV 2b, it reaches 89.15%. Ablation experiments validate the complementary roles of GATE and HADTE in modeling. Additionally, the model exhibits good generalization ability across subjects. Conclusions: This framework effectively captures the spatio-temporal dynamic characteristics and topological structure of MI-EEG signals. This hierarchical and interpretable framework provides a new approach for improving decoding performance in EEG motor imagery tasks.

23 pages, 3836 KB  
Article
RUDA-2025: Depression Severity Detection Using Pre-Trained Transformers on Social Media Data
by Muhammad Ahmad, Pierpaolo Basile, Fida Ullah, Ildar Batyrshin and Grigori Sidorov
AI 2025, 6(8), 191; https://doi.org/10.3390/ai6080191 - 18 Aug 2025
Viewed by 463
Abstract
Depression is a serious mental health disorder affecting cognition, emotions, and behavior. It impacts over 300 million people globally, with mental health care costs exceeding $1 trillion annually. Traditional diagnostic methods are often expensive, time-consuming, stigmatizing, and difficult to access. This study leverages NLP techniques to identify depressive cues in social media posts, focusing on both standard Urdu and code-mixed Roman Urdu, which are often overlooked in existing research. To the best of our knowledge, a script-conversion and combination-based approach for Roman Urdu and Nastaliq Urdu has not been explored earlier. To address this gap, our study makes four key contributions. First, we created a manually annotated dataset named RUDA-2025, containing posts in code-mixed Roman Urdu and Nastaliq Urdu for both binary and multiclass classification. The binary classes are "depression" and "not depression", with the depression class further divided into fine-grained categories: Mild, Moderate, and Severe depression, alongside not depression. Second, we applied two novel techniques to the RUDA-2025 dataset for the first time: (1) a script-conversion approach that translates between code-mixed Roman Urdu and Standard Urdu and (2) a combination-based approach that merges both scripts into a single dataset to address linguistic challenges in depression assessment. Finally, we ran 60 experiments using a combination of traditional machine learning and deep learning techniques to find the best-fitting model for depression detection. Based on our analysis, our proposed model (mBERT) with a custom attention mechanism outperformed the baseline (XGB) on the combination-based dataset and on the code-mixed Roman and Nastaliq Urdu script conversions.

24 pages, 18845 KB  
Article
ProtoLeafNet: A Prototype Attention-Based Leafy Vegetable Disease Detection and Segmentation Network for Sustainable Agriculture
by Yuluxin Fu and Chen Shi
Sustainability 2025, 17(16), 7443; https://doi.org/10.3390/su17167443 - 18 Aug 2025
Viewed by 437
Abstract
In response to the challenges posed by visually similar disease symptoms, complex background noise, and the need for fine-grained disease classification in leafy vegetables, this study proposes ProtoLeafNet—a prototype attention-based deep learning model for multi-task disease detection and segmentation. By integrating a class-prototype–guided attention mechanism with a prototype loss function, the model effectively enhances the focus on lesion areas and improves category discrimination. The architecture leverages a dual-task framework that combines object detection and semantic segmentation, achieving robust performance in real agricultural scenarios. Experimental results demonstrate that the model attains a detection precision of 93.12%, recall of 90.27%, accuracy of 91.45%, and mAP scores of 91.07% and 90.25% at IoU thresholds of 50% and 75%, respectively. In the segmentation task, the model achieves a precision of 91.79%, recall of 90.80%, accuracy of 93.77%, and mAP@50 and mAP@75 both reaching 90.80%. Comparative evaluations against state-of-the-art models such as YOLOv10 and TinySegformer verify the superior detection accuracy and fine-grained segmentation ability of ProtoLeafNet. These results highlight the potential of prototype attention mechanisms in enhancing model robustness, offering practical value for intelligent disease monitoring and sustainable agriculture.
37 pages, 5086 KB  
Article
Global Embeddings, Local Signals: Zero-Shot Sentiment Analysis of Transport Complaints
by Aliya Nugumanova, Daniyar Rakhimzhanov and Aiganym Mansurova
Informatics 2025, 12(3), 82; https://doi.org/10.3390/informatics12030082 - 14 Aug 2025
Abstract
Public transport agencies must triage thousands of multilingual complaints every day, yet the cost of training and serving fine-grained sentiment analysis models limits real-time deployment. The proposed “one encoder, any facet” framework therefore offers a reproducible, resource-efficient alternative to heavy fine-tuning for domain-specific sentiment analysis or opinion mining tasks on digital service data. To the best of our knowledge, we are the first to test this paradigm on operational multilingual complaints, where public transport agencies must prioritize thousands of Russian- and Kazakh-language messages each day. A human-labelled corpus of 2400 complaints is embedded with five open-source universal models. The resulting embeddings are matched to semantic “anchor” queries that describe three distinct facets: service aspect (eight classes), implicit frustration, and explicit customer request. In the strict zero-shot setting, the best encoder reaches 77% accuracy for aspect detection, 74% for frustration, and 80% for request; taken together, these signals reproduce human four-level priority in 60% of cases. Attaching a single-layer logistic probe on top of the frozen embeddings boosts performance to 89% for aspect, 83–87% for the binary facets, and 72% for end-to-end triage. Compared with recent fine-tuned sentiment analysis systems, our pipeline cuts memory demands by two orders of magnitude and eliminates task-specific training yet narrows the accuracy gap to under five percentage points. These findings indicate that a single frozen encoder, guided by handcrafted anchors and an ultra-light head, can deliver near-human triage quality across multiple pragmatic dimensions, opening the door to low-cost, language-agnostic monitoring of digital-service feedback. Full article
(This article belongs to the Special Issue Practical Applications of Sentiment Analysis)
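The anchor-matching step this abstract describes, classifying a text by which handcrafted anchor query its embedding is closest to, can be sketched as follows. The encoder itself is assumed (any sentence-embedding model would do); the toy vectors and function names here are illustrative, not taken from the paper.

```python
import numpy as np

def normalize(v):
    # L2-normalize row vectors so the dot product equals cosine similarity.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def zero_shot_classify(text_embs, anchor_embs):
    # Cosine similarity between each text embedding and each anchor embedding;
    # the most similar anchor determines the predicted class, with no training.
    sims = normalize(text_embs) @ normalize(anchor_embs).T
    return sims.argmax(axis=1)
```

The reported logistic-probe variant would replace the argmax with a single linear layer trained on the frozen embeddings, which is where the abstract's accuracy gains over the strict zero-shot setting come from.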

18 pages, 5888 KB  
Article
Incorporating Building Morphology Data to Improve Urban Land Use Mapping: A Case Study of Shenzhen
by Jiapeng Zhang, Fujun Song, Yimin Wang, Tuo Chen, Xuecao Li, Xiayu Tang, Tengyun Hu, Siyao Zhou, Han Liu, Jiaqi Wang and Mo Su
Remote Sens. 2025, 17(16), 2811; https://doi.org/10.3390/rs17162811 - 14 Aug 2025
Abstract
Accurate urban land use classification is vital for urban planning, resource allocation, and sustainable management. Traditional remote sensing methods struggle with fine-grained classification and spatial structure identification, while socio-economic data, like points of interest and road networks, face issues of uneven distribution and outdated updates. To explore the role of building morphology characteristics in enhancing urban land use classification and their potential as a substitute for socio-economic information, this study proposes a method integrating architectural features with multi-source remote sensing data, evaluated through an empirical analysis using a random forest model in Shenzhen. Three models were developed as follows: Model 1, utilizing only remote sensing data; Model 2, combining remote sensing with socio-economic data; and Model 3, integrating building morphology with remote sensing data to evaluate its potential for enhancing classification accuracy and substituting socio-economic data. Experimental results demonstrate that Model 3 achieves an overall accuracy of 80.09% and a Kappa coefficient of 0.77. By comparison, Model 1 achieves an accuracy of 74.56% and a Kappa coefficient of 0.70, while Model 2 reaches 79.56% accuracy and a Kappa coefficient of 0.76. Model 3 also shows greater stability in complex, smaller parcels. This method offers superior generalization and substitution potential in data-scarce, heterogeneous contexts, providing a scalable approach for fine-grained urban monitoring and dynamic management. Full article
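The two evaluation metrics this abstract reports, overall accuracy and the Kappa coefficient, can be computed from a confusion matrix as in the sketch below. This is a standard definition of Cohen's kappa, not code from the paper; the function name is an assumption.

```python
import numpy as np

def accuracy_and_kappa(y_true, y_pred, num_classes):
    # Build the confusion matrix: rows = true class, columns = predicted class.
    cm = np.zeros((num_classes, num_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    n = cm.sum()
    po = np.trace(cm) / n  # observed agreement = overall accuracy
    # Chance agreement: dot product of the row and column marginals.
    pe = (cm.sum(axis=1) @ cm.sum(axis=0)) / n ** 2
    kappa = (po - pe) / (1 - pe)  # agreement corrected for chance
    return po, kappa
```

Kappa discounts agreement expected by chance, which is why the abstract's Kappa values (0.70 to 0.77) sit below the corresponding accuracies (74.56% to 80.09%) even though the two metrics rank the three models identically.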
