MDPI - Publisher of Open Access Journals

23 pages, 1281 KB

Open Access Article

Digital-Twin-Oriented Virtual Training Environment for Agricultural Robot Navigation: A Vineyard Rover Case Study

by Gábor Kusper, Zoltán Barócsi, Péter Csóka, Krisztián Vajda and József Sütő

Sensors 2026, 26(12), 3766; https://doi.org/10.3390/s26123766 (registering DOI) - 12 Jun 2026

A virtual training environment offers clear advantages for agricultural robotics. It provides a safe setting in which perception, navigation, and control algorithms can be evaluated without risking damage to either the robot or the crop. It also supports efficient data generation: large volumes of training data can be collected under diverse environmental conditions that would be costly, slow, and often season-dependent in real-world deployments. This broader variability improves model adaptability, reduces the risk of overfitting, and leads to more robust operation. In this paper, we argue that digital twin technology should therefore be understood not merely as a passive mirror of a physical robot, but as an active training environment in which multiple sensor-related subprocesses can be developed, tested, validated, and refined jointly. This paper is based on our experiences with digital twin technology used in the development of a vineyard robot, including a self-driving rover, sensor simulation, procedural map generation, and agriculture-specific movement models. Our contribution is threefold: we reinterpret the digital twin as a training space, propose a layered framework for training agricultural robots in virtual environments, and explain why agriculture is a particularly strong use case, given variable field conditions, expensive real-world experimentation, and persistent labor scarcity. To validate this framework, we present the simulation-based evaluation of an autonomous reinforcement learning agent. The agent, trained entirely in this virtual environment, successfully navigated to 155 of 161 target points (about 96%) in a simulated vineyard demonstration environment. Full article

(This article belongs to the Special Issue Applications of Sensors Based on Embedded Systems)

27 pages, 7550 KB

Open Access Article

A Hybrid Inverse Kinematics Framework for Biomimetic Redundancy Resolution in 7-DoF Humanoid Arms

by Yapeng Shi, Zhen Chen, Ivan Mokiets, Songhao Piao, Teng Zhang and Lianzhao Zhang

Biomimetics 2026, 11(6), 408; https://doi.org/10.3390/biomimetics11060408 - 9 Jun 2026

Viewed by 123

Abstract

Resolving the kinematic redundancy of 7-DoF humanoid arms to generate natural, human-like motions remains a fundamental challenge in biomimetic robotics. This paper presents a hybrid inverse kinematics (IK) framework that learns a pose-dependent redundancy parameter and integrates it into a differential IK solver. Specifically, we employ the stereographic Shoulder–Elbow–Wrist (SEW) angle as a well-conditioned geometric parameterization. This formulation transforms the algorithmic singularity into a unidirectional half-line, which can be oriented outside the typical reachable workspace. To specify the optimal configuration within the self-motion manifold, a motion dataset was collected by teleoperating a humanoid arm via an anthropomorphic wearable exoskeleton. This approach translates operator-specific postural preferences into the robot’s joint space. A lightweight neural network was then trained to learn the mapping from end-effector poses to these operator-specific SEW angles. By incorporating the predicted SEW angle as a dynamic secondary objective in the null space of the primary tracking task, the proposed framework enables natural redundancy resolution while preserving end-effector tracking accuracy. Both simulations and real-robot experiments were conducted to validate the approach. Results show that, compared to the average performance of static fixed-parameter strategies, the proposed method improves the Joint Configuration Quality Index (CQI) by 22.5% and reduces energy costs by 11.3%. Moreover, the sub-millisecond inference latency (0.44 ms) facilitates seamless integration into real-time control pipelines. Full article

(This article belongs to the Special Issue Biologically Inspired Design and Control of Robots: Third Edition)
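
The null-space idea in the abstract above can be illustrated with a minimal sketch. It assumes a planar 3-link arm as a toy stand-in for the 7-DoF arm, and a hand-picked secondary joint velocity `z` in place of the term that would pull toward the predicted SEW angle. The function names and link lengths are hypothetical and are not taken from the paper.

```python
import math

def jacobian_planar_3r(q, lengths=(1.0, 1.0, 1.0)):
    """2x3 position Jacobian of a planar 3-link arm (toy model, assumed here)."""
    l1, l2, l3 = lengths
    a1, a2, a3 = q[0], q[0] + q[1], q[0] + q[1] + q[2]
    row_x = [-l1 * math.sin(a1) - l2 * math.sin(a2) - l3 * math.sin(a3),
             -l2 * math.sin(a2) - l3 * math.sin(a3),
             -l3 * math.sin(a3)]
    row_y = [l1 * math.cos(a1) + l2 * math.cos(a2) + l3 * math.cos(a3),
             l2 * math.cos(a2) + l3 * math.cos(a3),
             l3 * math.cos(a3)]
    return [row_x, row_y]

def pinv_2x3(J):
    """Right pseudo-inverse J^T (J J^T)^-1 of a full-rank 2x3 matrix (returns 3x2)."""
    a = sum(v * v for v in J[0])
    b = sum(u * v for u, v in zip(J[0], J[1]))
    d = sum(v * v for v in J[1])
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("Jacobian is singular at this configuration")
    inv = [[d / det, -b / det], [-b / det, a / det]]
    return [[J[0][i] * inv[0][0] + J[1][i] * inv[1][0],
             J[0][i] * inv[0][1] + J[1][i] * inv[1][1]] for i in range(3)]

def differential_ik_step(q, x_dot, z):
    """Joint velocity q_dot = J^+ x_dot + (I - J^+ J) z.

    The first term tracks the end-effector velocity x_dot; the second term
    moves the arm along its self-motion without disturbing the end effector.
    """
    J = jacobian_planar_3r(q)
    Jp = pinv_2x3(J)
    primary = [Jp[i][0] * x_dot[0] + Jp[i][1] * x_dot[1] for i in range(3)]
    Jz = [sum(J[r][k] * z[k] for k in range(3)) for r in range(2)]
    secondary = [z[i] - (Jp[i][0] * Jz[0] + Jp[i][1] * Jz[1]) for i in range(3)]
    return [p + s for p, s in zip(primary, secondary)]

q = [0.3, 0.5, -0.4]
q_dot = differential_ik_step(q, x_dot=[0.1, -0.2], z=[1.0, 0.0, 0.0])
```

Multiplying the Jacobian by `q_dot` should reproduce `x_dot` regardless of `z`, which is the sense in which a secondary objective can preserve end-effector tracking accuracy.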

19 pages, 5903 KB

Open Access Article

Quality Detection for Dragon Fruit Based on the End-of-Arm Spectral Sensor of the Harvesting Robot

by Zongxiu Bai, Qiu Xu, Kairan Lou and Bin Zhang

Foods 2026, 15(11), 1944; https://doi.org/10.3390/foods15111944 - 1 Jun 2026

Viewed by 182

Abstract

Quality grading of harvested dragon fruit is an important step in the dragon fruit industry. To reduce the high costs and damage rates caused by this process, an online spectral sensor and a weighing sensor embedded at the end effector of the dragon fruit-picking robot were designed to detect the sugar content, hardness and weight of the dragon fruits in real time during the picking process, thereby achieving the quality classification of the dragon fruits. After collecting the spectral data of dragon fruit, typical linear and nonlinear machine learning methods were used to establish prediction models for SSC-edge, SSC-center and hardness of dragon fruit. The results showed that PLSR models were selected as the optimal models for predicting sugar content and hardness, with test-set R² values for SSC-edge, SSC-center and hardness of 0.876, 0.826 and 0.902, respectively. Subsequently, the dragon fruits were classified based on the weighing sensor, and the SSC-center and hardness were predicted. The results showed that the established quality prediction model and the prototype could achieve the integrated operation of non-destructive quality detection and grading of dragon fruit during picking. The study provides technical support for the intelligent upgrade of fruit-harvesting equipment and the grading operations. Full article

(This article belongs to the Special Issue Advanced Technology for Rapid Comprehensive Analysis of the Food Composition)
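
For readers unfamiliar with the R² figures quoted in the abstract above, the coefficient of determination can be computed as in the following minimal sketch. The sample values are invented for illustration only and are not data from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all reference values are equal")
    return 1.0 - ss_res / ss_tot

# Hypothetical reference and predicted values (not from the paper).
measured = [11.2, 12.5, 13.1, 14.0, 12.0]
predicted = [11.0, 12.9, 13.0, 13.6, 12.3]
score = r_squared(measured, predicted)
```

A value of 1 means the predictions match the reference measurements exactly, while a value of 0 means the model does no better than always predicting the mean.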

38 pages, 46338 KB

Open Access Article

A Lightweight Real-Time Tomato Leaf Disease Detection System for Edge-Based Smart Agriculture

by Rong Zhao, Fei Deng, Haohua Que, Mingkai Liu, Xiejia Yue and Lei Mu

Sensors 2026, 26(11), 3474; https://doi.org/10.3390/s26113474 - 31 May 2026

Viewed by 460

Abstract

Tomato leaf diseases substantially reduce tomato yields and quality and remain a persistent challenge for efficient crop management. Although deep learning-based detectors have achieved strong accuracy in controlled benchmarks, many existing solutions are still difficult to transfer to resource-constrained agricultural systems because they rely on high-end GPUs, consume considerable power, and often lose performance after deployment on embedded devices. To address this practical gap, this study proposes HGS-YOLO, a system-oriented deployable lightweight adaptation of YOLOv11 for leaf-level tomato disease detection, together with an end-to-end edge sensing pipeline for low-power agricultural deployment. The main contribution lies in the coordinated system-level co-design of model structure, optimization, and deployment rather than in a novel detector architecture. Specifically, YOLOv11 is adapted through three coordinated modifications: an HGNetV2 backbone for efficient feature extraction, an HS-FPN neck with channel attention for lightweight multi-scale fusion, and an MPDIoU loss function for more stable localization optimization. Beyond the model architecture, the study establishes a complete engineering pipeline that includes training, optimization, post-training quantization, and hardware deployment with BPU acceleration on a D-Robotics RDK X5 handheld platform. Comprehensive benchmark experiments indicate that HGS-YOLO achieves 93.6% mAP50 and 72.1% mAP@[0.5:0.95] with 86.5% recall, only 1.3 M parameters, and a 3.1 MB model size, substantially reducing the model complexity and storage cost relative to the YOLOv11 baseline. A three-seed retraining comparison shows that HGS-YOLO trades roughly 0.5 mAP50 points for this compactness (a statistically significant but small concession) and recovers the cost on the deployment side: on the RDK X5 chip, HGS-YOLO is the fastest, most memory-efficient, and lowest-power model among all compared detectors. Indoor deployment tests using separately collected tomato leaf samples further achieve 90.3% mAP50, 82.3% recall, 89.0% precision, 25.0 ± 0.4 ms end-to-end latency, 40.0 ± 0.6 FPS, and 9.8 ± 0.4 W average system power. After PTQ, the mAP50 drops from 93.6% to 93.0% on the same benchmark; because this figure was measured under controlled imaging conditions, it is presented as an in-distribution reference point rather than as evidence of robustness in the open field. We also took the handheld system into a working tomato greenhouse for a small outdoor field round, where it ran end-to-end and produced on-device disease detections under natural sunlight, specular highlights, partial occlusion, background clutter, and handheld motion blur. These results show that HGS-YOLO reaches a good balance of accuracy, efficiency, and deployability and that it works in the field on an independent small-scale test; validating it more widely across sites, seasons, and weather is left to future work. Full article

(This article belongs to the Special Issue From Innovation to Field Adoption: Sensing and Robotic Systems in Smart Agriculture)
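
The abstract above names an MPDIoU loss without giving its formula. The sketch below follows the commonly cited definition of MPDIoU (IoU minus the squared distances between matching top-left and bottom-right corners, each normalized by the squared image diagonal); treating this as the exact variant used in HGS-YOLO is an assumption, and the function names are hypothetical.

```python
def mpd_iou(box_a, box_b, img_w, img_h):
    """MPDIoU for two boxes given as (x1, y1, x2, y2) in pixels.

    Assumed definition: IoU - d1^2 / (w^2 + h^2) - d2^2 / (w^2 + h^2),
    where d1 and d2 are the distances between the top-left and the
    bottom-right corners, and w, h are the input image width and height.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm

def mpd_iou_loss(box_a, box_b, img_w, img_h):
    """Loss is 1 - MPDIoU, so a perfect match gives zero loss."""
    return 1.0 - mpd_iou(box_a, box_b, img_w, img_h)

# Hypothetical prediction and ground-truth boxes on a 640x640 input.
loss = mpd_iou_loss((100, 100, 200, 200), (110, 105, 210, 195), 640, 640)
```

Unlike plain IoU, the corner-distance terms still produce a gradient signal when the boxes do not overlap at all, which is one plausible reason such losses are described as more stable for localization.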

24 pages, 8861 KB

Open Access Article

BerryFlowerNet: A Customized Convolutional Neural Network for Blueberry Flower Cluster Detection and Flowering Stage Prediction with a Field Phenotyping Robot

by Chenjiao Tan, Nolan Gao, Ye Chu and Changying Li

Agriculture 2026, 16(11), 1159; https://doi.org/10.3390/agriculture16111159 - 25 May 2026

Viewed by 293

Abstract

Blueberry production has rapidly expanded over the past decade, accompanied by growing demand for efficient and accurate methods to monitor the flowering and fruiting phases of blueberry development, which has a direct impact on yield potential. Accurate determination of blueberry phenology enables growers to make data-driven decisions on freeze protection applications and harvest windows. In addition, objective phenology data of blueberry mapping populations will provide high-quality phenotype data for the discovery of genetic mechanisms regulating blueberry flowering and fruiting times. Traditional approaches, such as manual counting and visual ratings, are labor-intensive and subjective in capturing variation across genotypes. Recent progress in computer vision and deep learning has enabled automated flower detection, but most existing studies on blueberries remain restricted to narrow flowering windows or close-up images, limiting their application at the bush level and across the seasonal development. In this study, we developed BerryFlowerNet, a customized YOLO-based model to detect and count blueberry flower clusters from bud to green fruit stages. A comprehensive dataset was collected on three dates using a field phenotyping robot, covering five flowering stages. The integration of CFNet, a custom module fusing shallow spatial features, and PIoU loss improved the detection performance. Additionally, the Slicing Aided Hyper Inference algorithm was employed to address small-object detection in bush-level images. Experimental results demonstrated that BerryFlowerNet outperformed the baseline YOLO model and three additional detectors, achieving an average mAP0.5 of 0.644 across five independent training runs. The model achieved an accuracy of 0.88 when predicting blueberry flowering stages, indicating its effectiveness and accuracy. Additionally, the results of the bush-level image analysis showed the capability of the model to capture genotype-level differences in flowering dynamics. Overall, this approach offers new opportunities for growers and breeders to determine blueberry phenological development that is critical for optimizing on-farm management strategies and advancing precision phenotyping to facilitate the development of climate-resilient blueberries. Full article

(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
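
The slicing step behind Slicing Aided Hyper Inference, mentioned in the abstract above, can be sketched as follows. This is a simplified illustration of the general idea (overlapping tiles, then mapping detections back to full-image coordinates); it is not the SAHI library's API, and the tile size and overlap ratio are assumed values rather than settings reported by the authors.

```python
def slice_windows(img_w, img_h, tile=640, overlap=0.2):
    """Return overlapping (x1, y1, x2, y2) tiles that cover the whole image."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    step = max(1, round(tile * (1 - overlap)))

    def starts(length):
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile, step))
        positions.append(length - tile)  # last tile sits flush with the edge
        return positions

    return [(x, y, min(x + tile, img_w), min(y + tile, img_h))
            for y in starts(img_h) for x in starts(img_w)]

def to_full_image(box, window):
    """Shift a box detected inside a tile back into full-image coordinates."""
    x1, y1, x2, y2 = box
    off_x, off_y = window[0], window[1]
    return (x1 + off_x, y1 + off_y, x2 + off_x, y2 + off_y)

# Hypothetical 1000x700 bush-level image.
windows = slice_windows(1000, 700, tile=640, overlap=0.2)
```

Running a detector on each tile keeps small flower clusters at a usable pixel size; in a full pipeline the shifted boxes would still need to be merged, for example with non-maximum suppression, to remove duplicates in the overlap regions.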

38 pages, 730 KB

Open Access Review

Artificial Intelligence Applications in Implant Positioning, Dislocation Risk Prediction, and Surgical Indications in Orthopaedic Surgery

by Mihai Emanuel Gherghe, Alex-Gabriel Grigore, Iosif-Aliodor Timofticiuc, Adelina-Elena Moise, Constantin-Adrian Andrei, Serban Dragosloveanu, Dana-Georgiana Nedelea, Łukasz Pulik, Catalin Anghel, Cristian Scheau and Romica Cergan

Bioengineering 2026, 13(6), 610; https://doi.org/10.3390/bioengineering13060610 - 23 May 2026

Viewed by 417

Abstract

Background: Artificial intelligence (AI) is becoming increasingly integrated into orthopaedic surgery for tasks such as implant positioning, dislocation risk prediction, and surgical decision-making. However, the current evidence varies widely across anatomical regions and applications. Methods: A structured narrative review was conducted using PubMed and Web of Science Core Collection to identify studies applying machine learning or deep learning in orthopaedic procedures, focusing on parameters such as the anatomical region addressed, data types used, primary AI tasks, evaluation designs, and validation strategies. Reviews and meta-analyses were excluded. Study selection was summarized using a PRISMA-style flow diagram, and included studies were narratively synthesized according to anatomical region, AI task, imaging modality, validation strategy, and clinical relevance. Results: We identified three main application areas: (1) AI in imaging-driven planning and implant positioning, often linked with navigation or robotic systems; (2) postoperative evaluation related to implants; and (3) prediction of clinically relevant outcomes such as dislocation risk. The strongest evidence is found in hip arthroplasty, where AI improves measurement accuracy and workflow efficiency, whereas applications in knee, shoulder, and spine surgery are less developed and often supported by smaller studies. Although existing risk prediction models demonstrate good performance, their generalizability is hindered by limited external validation and inconsistent reporting. Conclusions: Overall, while AI shows significant promise in enhancing various aspects of orthopaedic surgery, stronger links between technical advancements and patient outcomes are needed. Future research should prioritize extensive validations, workflow-aware evaluations, failure analysis, and adherence to AI-specific reporting guidelines to facilitate safe and effective clinical implementation. Full article

(This article belongs to the Special Issue Deep Learning for Medical Applications: Challenges and Opportunities)

21 pages, 3948 KB

Open Access Article

Demonstrating Data-to-Knowledge Pipelines for Connecting Production Sites in the World Wide Lab

by Leon Gorissen, Jan-Niklas Schneider, Mohamed Behery, Philipp Brauner, Moritz Lennartz, David Kötter, Thomas Kaster, Oliver Petrovic, Christian Hinke, Thomas Gries, Gerhard Lakemeyer, Martina Ziefle, Christian Brecher and Constantin Häfner

Mach. Learn. Knowl. Extr. 2026, 8(5), 136; https://doi.org/10.3390/make8050136 - 20 May 2026

Viewed by 331

Abstract

The digital transformation of production requires methods for integrating, storing, and operationalizing data across organizational boundaries, yet most existing approaches remain siloed and unidirectional, lacking a systematic loop from raw data to actionable knowledge and back. We introduce Data-to-Knowledge (D2K) and Knowledge-to-Data (K2D) pipelines as a universal production concept built on networks of Digital Shadows. The Data-to-Knowledge (D2K) pipeline is realized as a cross-organizational proof of concept that captures and semantically annotates robotic trajectory data from three independent research institutes and uses those data to train an inverse-dynamics foundation model for robot control. Centralized aggregation via an existing FAIR-compliant research data repository was chosen deliberately over federated alternatives to maximize semantic interoperability and reuse of shared infrastructure; federated and privacy-preserving extensions are identified as a promising future direction. Fine-tuning the cross-organizationally trained foundation model reduces training time by approximately 85% relative to end-to-end training from scratch, while achieving comparable accuracy on a standardized inverse-dynamics benchmark. These gains are attributable to the combination of cross-site data aggregation and transfer learning; isolating the contribution of semantic annotation alone remains a topic for future ablation work. The implementation demonstrates that semantically enriched, cross-organizational D2K pipelines can accelerate model development and reduce redundant data collection within a constrained but practically relevant class of robotics tasks. We further discuss limitations, governance challenges, and how these pipelines can contribute to a broader World Wide Lab for collaborative production research. Full article

(This article belongs to the Section Learning)

25 pages, 6089 KB

Open Access Article

MKT-GMM: A Motion Knowledge Transferring Framework for Robot Trajectory Adaptation to Variable Via-Points

by Congcong Ye, Chengxing Wu, Miao Luo, Lunping Li and Xu Tang

Biomimetics 2026, 11(5), 351; https://doi.org/10.3390/biomimetics11050351 - 19 May 2026

Viewed by 342

Abstract

Human motion provides a valuable source of information for robotic skill acquisition, and Learning from Demonstration (LfD) has been widely adopted as an intuitive paradigm for enabling robots to learn tasks from human demonstrations. However, the lack of an explicit representation of transferable motion knowledge significantly limits the adaptability of LfD when tasks involve varying spatial constraints or environmental configurations. To address this challenge, this paper proposes a motion representation framework based on two fundamental properties of motion and introduces a novel Motion Knowledge Transferring Gaussian Mixture Model (MKT-GMM) for trajectory generalization across related tasks. In the proposed framework, demonstration trajectories from a source task are first collected through kinesthetic teaching and encoded using a Gaussian Mixture Model (GMM), where each Gaussian component represents a local motion primitive. Transferable motion knowledge is captured by jointly preserving the statistical characteristics of individual motion primitives and the geometric relationships between adjacent primitives. For a target task in which only task constraints are specified, the learned motion knowledge is transferred by adapting the GMM parameters through affine transformations combined with constraint-error minimization, enabling feasible trajectories to be generated without additional demonstrations or model retraining. The final motions are reconstructed using Gaussian Mixture Regression (GMR), ensuring smooth and consistent trajectory generation. To further improve the robustness of trajectory transfer, a pseudo via-point mechanism is introduced to automatically generate intermediate constraints when explicit via-points are unavailable. Experiments conducted on a robotic manipulation platform, including handwriting motion learning and pick-and-place tasks under varying task configurations, demonstrate that the proposed method effectively captures transferable motion knowledge and achieves reliable trajectory generalization for previously unseen tasks. Full article

(This article belongs to the Section Bioinspired Sensorics, Information Processing and Control)
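
The Gaussian Mixture Regression step named in the abstract above can be sketched for the simplest case: a time input and a one-dimensional position output. The component parameters below are invented for illustration, and the sketch covers only standard GMR, not the MKT-GMM transfer procedure itself.

```python
import math

def gmr_1d(t, components):
    """Conditional mean E[x | t] under a 2-D Gaussian mixture over (t, x).

    Each component is (weight, mean_t, mean_x, var_t, cov_tx).
    """
    responsibilities = []
    conditional_means = []
    for weight, mean_t, mean_x, var_t, cov_tx in components:
        # Likelihood of the query time under this component's marginal over t.
        pdf = math.exp(-0.5 * (t - mean_t) ** 2 / var_t) / math.sqrt(2 * math.pi * var_t)
        responsibilities.append(weight * pdf)
        # Linear regression of x on t inside this component.
        conditional_means.append(mean_x + cov_tx / var_t * (t - mean_t))
    total = sum(responsibilities)
    if total == 0:
        raise ValueError("query time is too far from every component")
    return sum(r * m for r, m in zip(responsibilities, conditional_means)) / total

# Hypothetical two-primitive model of a short motion (not from the paper).
model = [
    (0.5, 0.25, 0.0, 0.02, 0.01),
    (0.5, 0.75, 1.0, 0.02, 0.01),
]
trajectory = [gmr_1d(step / 10, model) for step in range(11)]
```

Because the output blends the per-component regressions with smoothly varying weights, the reconstructed trajectory passes gradually from one motion primitive to the next instead of jumping between them.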

21 pages, 3302 KB

Open Access Article

Integrating Vision–Language–Action Models and RGB-D Sensing for Robotic Waste Sorting on KUKA LBR iiwa

by Teresa Sinico, Daniele Businaro and Giovanni Boschetti

Robotics 2026, 15(5), 100; https://doi.org/10.3390/robotics15050100 - 18 May 2026

Viewed by 391

Abstract

Robotic waste sorting presents significant challenges, including object variability, cluttered environments, and the predominant reliance on deep learning and traditional computer vision techniques, which typically demand extensive datasets and task-specific training. This paper introduces a robotic waste sorting system that integrates the Gemini Vision–Language–Action (VLA) model with a KUKA LBR iiwa collaborative robot and an RGB-D camera. Our approach leverages the advanced reasoning capabilities of large, pre-trained VLA models to perform waste sorting, without requiring explicit training or dataset collection. Key contributions include the development of effective prompt engineering strategies for waste object identification, the assessment of the VLA’s performance in terms of inference time and accuracy, and the development of different grasping strategies for operation in cluttered scenarios. Our experimental tests demonstrated that the system’s inference time is between 2 and 4 s, which is suitable for collaborative robotic applications, and the system achieved a high overall classification accuracy of 89.64%. Crucially, we demonstrated that integration of RGB-D sensing enhanced the model’s ability to perceive object heights, resolve occlusions, and make informed grasping decisions in realistic, three-dimensional settings. We further validated multiple real-world grasping strategies, demonstrating tradeoffs between system efficiency and safety in heavily cluttered scenarios. This work establishes a practical and adaptable framework for deploying VLA-driven intelligence on commercial robotic platforms, highlighting the potential of VLAs for complex manipulation tasks beyond waste sorting. Full article

(This article belongs to the Special Issue IFToMM for Sustainable Development Goals: Contributions from I4SDG 2025 Conference)

27 pages, 1021 KB

Open Access Article

Application of Deep Learning for the Classification of Activities of Daily Living Using Sensor Data

by Kajetan Jeznach and Piotr Falkowski

Appl. Sci. 2026, 16(10), 4958; https://doi.org/10.3390/app16104958 - 15 May 2026

Viewed by 213

Abstract

The growing integration of rehabilitation robotics and artificial intelligence has created new opportunities for developing control strategies that better support clinicians during patient therapy. This study investigates machine learning and deep learning approaches for classifying upper limb motion using encoder-based biomechanical data, with the goal of identifying a model suitable for implementation in a rehabilitation exoskeleton. Several classical algorithms such as k-Nearest Neighbors, Random Forest, multiclass logistic regression, XGBoost, and an SVM classifier were evaluated alongside three deep learning architectures: convolutional layers, GRU and LSTM units. Models were trained and tested on two types of datasets using both standard cross-validation and leave-one-subject-out validation. The analysis included assessments of class separability, signal features’ importance, and comparative performance based on F1-score, accuracy, and confusion matrices. Results showed notable differences between validation strategies, with LOSO evaluation revealing limitations of the available dataset and emphasising the need for broader data collection. Overall, the findings indicate that, in the LOSO evaluation of the five-class multi-subject dataset—the most clinically realistic validation scenario—the LSTM-based model achieved the highest generalisation performance (accuracy 92.8%, macro-F1 0.927), supporting its suitability for integration into exoskeleton control systems aimed at detecting and mitigating compensatory movements. Full article

(This article belongs to the Special Issue Current Advances in Rehabilitation Technology)
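
The difference between standard cross-validation and leave-one-subject-out (LOSO) validation, which the abstract above highlights, comes down to how samples are split. The following minimal sketch shows a LOSO split and a macro-F1 score; the subject labels and predictions are invented, and the helper names are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical recording-to-subject assignment (not the study's dataset).
subjects = ["s1", "s1", "s2", "s3", "s3"]
folds = list(leave_one_subject_out(subjects))
```

Since no recording from the held-out subject ever appears in training, LOSO estimates how a model behaves on a person it has never seen, which is why it tends to give lower but more realistic scores than random cross-validation.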

11 pages, 1329 KB

Open Access Proceeding Paper

Neuromorphic AI-Based e-Skin for Emotion-Sensitive Humanoid Robots

by Shubham Gupta and Suhaib Ahmed

Eng. Proc. 2026, 124(1), 114; https://doi.org/10.3390/engproc2026124114 - 7 May 2026

Viewed by 765

Abstract

Humanoid robots operating in proximity to humans require the ability to perceive and interpret emotional cues conveyed through touch to achieve safe, natural, and socially intelligent interaction. Conventional tactile sensing systems primarily focus on force or pressure detection and cannot infer affective intent, while frame-based deep learning models often suffer from high latency and energy consumption when deployed on embedded platforms. To address these limitations, this paper presents a neuromorphic AI-based multimodal electronic skin (e-skin) framework for emotion-sensitive touch perception in humanoid robots. The proposed system integrates pressure, temperature, and electrostatic sensing with a bio-inspired signal conditioning pipeline and a Spiking Neural Network (SNN) for event-driven, low-power processing. A custom multimodal tactile dataset was collected using the proposed e-skin prototype to model four emotional touch interactions: stress, neutral, comfort, and affection. Experimental results demonstrate that the proposed approach achieves a high emotion classification accuracy of up to 92%, with an average accuracy of 88.75% across all classes. The neuromorphic SNN significantly reduces inference latency to approximately 8 ms, compared to 38 ms for a conventional CNN-based model, while maintaining energy-efficient operation suitable for edge deployment. The results validate the effectiveness of combining multimodal tactile sensing with neuromorphic processing to enable real-time, emotion-aware human–robot interaction. Full article

(This article belongs to the Proceedings of The 6th International Electronic Conference on Applied Sciences)
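
To make the event-driven processing mentioned in the abstract above more concrete, here is a minimal sketch of a leaky integrate-and-fire (LIF) neuron, a common building block of spiking neural networks. Whether the paper uses this exact neuron model is an assumption, and the threshold, leak factor, and input values are illustrative only.

```python
def lif_spike_train(inputs, threshold=1.0, leak=0.9, reset=0.0):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    The membrane potential decays by `leak` each step, accumulates the input
    current, and emits a spike (1) when it reaches `threshold`.
    """
    potential = 0.0
    spikes = []
    for current in inputs:
        potential = leak * potential + current
        if potential >= threshold:
            spikes.append(1)
            potential = reset  # membrane resets after firing
        else:
            spikes.append(0)
    return spikes

# Hypothetical conditioned pressure signal: a light touch, then a firm press.
signal = [0.1, 0.1, 0.1, 0.8, 0.8, 0.8]
spikes = lif_spike_train(signal)
```

The neuron stays silent while the input is weak and only produces events when stimulation is strong enough, which illustrates why spike-based processing can save computation and energy on embedded hardware.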

21 pages, 3383 KB

Open Access Article

A Synthetic Data Generation Framework for the Development of Computer Vision Applications in Manufacturing

by Kosmas Alexopoulos, Christos Manettas, Dimitrios Tsikos and Nikolaos Nikolakis

Appl. Sci. 2026, 16(9), 4388; https://doi.org/10.3390/app16094388 - 30 Apr 2026

Viewed by 652

Abstract

Machine learning techniques are increasingly used for computer vision applications in manufacturing. Synthetic data, generated through realistic simulations, are utilized to accelerate the data collection process while optimizing accuracy and precision of ML models. However, in manufacturing there is usually the need for the development of several CV applications that support different production steps. This obstacle requires a systematic approach for generating synthetic datasets that can be used for developing effective CV systems. Hence, this work presents a pipeline for generating photorealistic synthetic datasets, using a set of digital tools such as 3D modeling, photorealistic rendering, automated labeling, and ML training tools. The proposed framework is tested and validated in a robot-assisted packaging case in the dairy industry. The industrial use case provides a pilot-level demonstration that the synthetic dataset generation framework can support the development of CV modules across several production steps and thus it can aid in accelerating commissioning and reconfiguration of industrial automation setups. Moreover, the pilot validation indicates that object detection and recognition models trained on synthetic data can provide sufficient performance for the specific requirements of the examined packaging scenario. Full article

33 pages, 4500 KB

Open Access Article

Evaluating the Impact of VR Training Strategies on HRI Cooperative Assembly Performance

by Paola Farina, Valentina De Simone, Salvatore Miranda and Valentina Di Pasquale

Appl. Sci. 2026, 16(9), 4305; https://doi.org/10.3390/app16094305 - 28 Apr 2026

Viewed by 499

Abstract

Virtual Reality (VR) has emerged as a powerful tool for improving training strategies in advanced manufacturing through immersive experiences. Within this context, this study examines the impact of two training strategies, VR and Video-Based (VB) instructions, on system performance (execution time and human errors) in a cooperative Human–Robot Interaction (HRI) assembly task. Overall, 26 participants completed the task after receiving either VR or VB training, and a sub-sample of 6 people per group returned one month later to repeat the task, enabling an evaluation of performance over time. Objective and subjective metrics were collected, and statistical and effect size analyses were conducted to compare training effects across sessions. Results show that execution times and the number of errors were comparable between VR and VB in the first real session. After one month, both groups exhibited improved performance, but VR-trained participants retained, on average, lower error rates, with a 71% reduction and the number of errors dropping to zero, and more stable error patterns, whereas VB-trained participants displayed greater variability and occasional accuracy degradation during repeated task execution. Moreover, within-group comparisons show that VR training is more effective for accuracy-critical cooperative HRI tasks. At the same time, VB remains a low-cost option for time-focused contexts. Together, these results shed light on how training modalities influence learning and forgetting in Industry 5.0.

24 pages, 8644 KB

Open Access Article

YOLO-REFB: Rectangular Edge Fusion for Cardboard Box Detection in Warehouse Environments Using Mobile Robot

by Narendra Kumar Kolla and Pandu Ranga Vundavilli

Modelling 2026, 7(3), 83; https://doi.org/10.3390/modelling7030083 - 28 Apr 2026

Viewed by 728

Abstract

Accurate detection of cardboard boxes is essential for mobile manipulators performing pick-and-place operations in warehouses. Conventional object detection methods like YOLOv11 struggle in low-texture and occluded environments. This paper presents YOLO-REFB, a novel object detection framework for real-time cardboard box detection in robotic manipulation using a dual-arm mobile robot (DAMR) operating in indoor warehouse environments. The proposed approach enhances the network by integrating the Rectangular Edge Fusion Block (REFB) into the YOLOv11 architecture; the block focuses on learning the geometric and structural features of cardboard boxes. Enhanced edge information extraction and feature fusion improve training stability and localization accuracy. A custom dataset of 3501 annotated images, collected under varied conditions, was used. The images were randomly assigned to training and validation sets in an 80:20 ratio. They were manually annotated, and training was performed, using Roboflow software, ensuring precise alignment of bounding boxes with cardboard box edges for accurate comparison with existing YOLO models. The model outperformed existing YOLO variants (YOLOv8n and YOLOv5n) in terms of precision (89.29%), recall (83.95%), and F1-score (86.54%). YOLO-REFB also achieved improved localization metrics, including mean Average Precision (mAP)@0.5 (91.68%) and mAP@0.5:0.95 (68.61%). The inclusion of REFB was essential to the performance gains, enabling effective detection of objects in challenging environments. Future developments may include 3D pose estimation and multi-object grasp planning for advanced robotic manipulation.
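
The F1-score is the harmonic mean of precision and recall, so the three values quoted in the abstract can be checked against each other. The short sketch below recomputes it from the reported precision and recall; the function name `f1_score` is illustrative and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
reported_precision = 89.29
reported_recall = 83.95

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 86.54%"
```

The recomputed value agrees with the reported F1-score of 86.54%.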

32 pages, 18066 KB

Open Access Article

Grapevine Winter Pruning Point Localization Using YOLO-Based Instance Segmentation

by Magdalena Kapłan and Kamil Buczyński

Agriculture 2026, 16(9), 943; https://doi.org/10.3390/agriculture16090943 - 24 Apr 2026

Viewed by 1028

Abstract

Winter pruning is a key management practice in viticulture that directly affects vine architecture, yield balance, and grape quality. At the same time, it is a highly labor-intensive operation, and the selective identification of appropriate cutting locations remains one of the main challenges limiting the automation of pruning in vineyards. Advances in machine vision provide new opportunities to support the development of robotic pruning systems. The objective of this study was to develop and evaluate a vision-based method for estimating grapevine pruning points and cutting lines using instance segmentation outputs generated by YOLO models. A dataset of 1500 RGB images of dormant grapevines was collected under field conditions in the Nobilis vineyard located in southeastern Poland. Two annotation strategies were implemented to define pruning regions. YOLO-based instance segmentation models were trained and evaluated for detecting cutting-related structures. Based on the predicted segmentation masks, a geometry-based method termed PCAcutSeg-V was developed to estimate class-dependent cutting points and cutting lines using principal component analysis applied to object contours. The results indicate that YOLOv8 and YOLO11 architectures achieved the highest segmentation performance among the evaluated models. The simplified annotation strategy provided more stable geometric inputs for the PCAcutSeg-V method, enabling more reliable estimation of cutting points and cutting lines compared with the extended annotation approach. When combined with the PCAcutSeg-V method, the proposed perception–geometry pipeline achieved high effectiveness in pruning decision estimation. The method was further implemented in a real-time processing pipeline using an RGB camera and an edge computing platform, where it maintained performance consistent with the results obtained from offline image analysis. These findings demonstrate that combining deep learning-based instance segmentation with deterministic geometric reasoning enables accurate and interpretable estimation of grapevine pruning locations and provides a promising foundation for future autonomous pruning systems.
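
To illustrate the general idea of applying principal component analysis to an object contour, the sketch below finds the dominant direction of a set of 2D contour points and draws a line perpendicular to it through the centroid. This is not the PCAcutSeg-V method: the abstract does not describe its class-dependent rules, so placing the cut at the centroid and perpendicular to the main axis is an assumption made here purely for illustration, as are the function names and the synthetic contour.

```python
import numpy as np


def principal_axis(contour):
    """Return (centroid, unit vector of the dominant direction) for an (N, 2) contour."""
    centroid = contour.mean(axis=0)
    centered = contour - centroid
    cov = np.cov(centered, rowvar=False)      # 2x2 covariance of x and y
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    axis = eigvecs[:, np.argmax(eigvals)]     # direction of largest variance
    return centroid, axis / np.linalg.norm(axis)


def cutting_line(contour, half_length=10.0):
    """Hypothetical cut: a segment through the centroid, perpendicular to the main axis."""
    centroid, axis = principal_axis(contour)
    normal = np.array([-axis[1], axis[0]])    # main axis rotated by 90 degrees
    return centroid - half_length * normal, centroid + half_length * normal


# Synthetic contour: the two long edges of a thin horizontal strip (a stand-in for a cane mask).
xs = np.linspace(0.0, 100.0, 51)
contour = np.concatenate([
    np.column_stack([xs, np.zeros_like(xs)]),
    np.column_stack([xs, np.full_like(xs, 4.0)]),
])

centroid, axis = principal_axis(contour)
start, end = cutting_line(contour)
print("centroid:", centroid)
print("main axis:", axis)
print("cut from", start, "to", end)
```

The sign of the eigenvector returned by `np.linalg.eigh` is arbitrary, so the main axis may point in either direction along the strip; the resulting cut segment is the same either way.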

(This article belongs to the Special Issue Adapting Horticultural Plant Cultivation Technology and Storage to Changing Conditions)
