MDPI - Publisher of Open Access Journals

25 pages, 1623 KB

Open AccessArticle

Improved YOLOv8s-Based Detection for Lifting Hooks and Safety Latches

by Yunpeng Guo, Dianliang Xiao, Xin Ruan, Ran Li and Yuqian Wang

Appl. Sci. 2025, 15(18), 9878; https://doi.org/10.3390/app15189878 (registering DOI) - 9 Sep 2025

Lifting hooks equipped with safety latches are critical terminal components of lifting machinery. The safety condition of this component is a crucial factor in preventing load dislodgement during lifting operations. To achieve intelligent monitoring of the hook and the safety latch, precise identification of these components is a crucial initial step. In this study, we propose an improved YOLOv8s detection model called YOLO-HOOK. To reduce computational complexity while maintaining precision, the model incorporates an Efficient_Light_C2f module, which integrates a Convolutional Gated Linear Unit (CGLU) with Star Blocks. The neck network uses a Multi-Scale Efficient Cross-Stage Partial (MSEICSP) module to improve edge feature extraction under complex lighting conditions and multi-scale variations. Furthermore, a HOOK_IoU loss function was designed to optimize bounding box regression through auxiliary bounding boxes, and a piecewise linear mapping strategy was used to improve localization precision for challenging targets. Ablation studies and comparative analyses show that YOLO-HOOK achieved mAP scores of 90.4% at an Intersection over Union (IoU) threshold of 0.5 and 71.6% across the 0.5–0.95 IoU range, exceeding the YOLOv8s baseline by 4.6% and 5.4%, respectively. It also reached a precision of 97.0% and a recall of 83.4%. The model parameters were reduced to 9.6 M, the computational cost was held to 31.0 giga floating-point operations (GFLOPs), and the inference speed reached 310 frames per second (FPS), balancing a lightweight design with strong performance. These findings offer a technical approach for the intelligent recognition of hooks and safety latches during lifting operations, thus helping to refine the safety management of lifting operations. Full article
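The mAP figures above are reported at IoU thresholds, so a minimal sketch of plain IoU may help readers place the numbers. This is the standard box-overlap ratio, not the paper's HOOK_IoU loss (whose auxiliary boxes and piecewise mapping are not specified in the abstract); the function names and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a match when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

Raising `threshold` from 0.5 toward 0.95 makes matching stricter, which is why the averaged 0.5–0.95 score is lower than the score at 0.5.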

21 pages, 2676 KB

Open AccessArticle

DT-HRL: Mastering Long-Sequence Manipulation with Reimagined Hierarchical Reinforcement Learning

by Junyang Zhang, Yilin Zhang, Honglin Sun, Yifei Zhang and Kenji Hashimoto

Biomimetics 2025, 10(9), 577; https://doi.org/10.3390/biomimetics10090577 - 1 Sep 2025

Viewed by 379

Abstract

Robotic manipulators in warehousing and logistics often face complex tasks that involve multiple steps, frequent task switching, and long-term dependencies. Inspired by the hierarchical structure of human motor control, this paper proposes a Hierarchical Reinforcement Learning (HRL) framework utilizing a multi-task goal-conditioned Decision Transformer (MTGC-DT). The high-level policy treats the Markov decision process as a sequence modeling task, allowing the agent to manage temporal dependencies. The low-level policy is made up of parameterized action primitives that handle physical execution. This design improves long-term reasoning and generalization. The method is evaluated on two common logistics manipulation tasks, sequential stacking and spatial sorting, with sparse rewards and a low-quality dataset. The main contributions are an HRL framework that integrates the Decision Transformer (DT) with task and goal embeddings and a path-efficiency loss (PEL) correction, and a parameterized, learnable primitive skill library for low-level control that enhances generalization and reusability. Experimental results demonstrate that the proposed Decision Transformer-based Hierarchical Reinforcement Learning (DT-HRL) achieves a success rate over 10% higher and an average reward over 8% higher than the baseline, and a normalized score increase of over 2% in the ablation experiments. Full article
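Treating a Markov decision process as sequence modeling usually means feeding the model interleaved return, state, and action tokens. The sketch below follows the original Decision Transformer formulation (returns-to-go conditioning), which is an assumption here: the abstract does not say how MTGC-DT builds its sequences, and the task and goal embeddings are left out. All names are hypothetical.

```python
def returns_to_go(rewards):
    """Suffix sums of the reward sequence: rtg[t] = rewards[t] + ... + rewards[-1]."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def interleave(rtg, states, actions):
    """Flatten a trajectory into (kind, value) tokens: return, state, action per step."""
    tokens = []
    for r, s, a in zip(rtg, states, actions):
        tokens.extend([("return", r), ("state", s), ("action", a)])
    return tokens
```

With a sparse reward such as `[0, 0, 1]`, every step carries the same return-to-go of 1.0, so the conditioning signal is present from the first token even though the reward arrives only at the end.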

(This article belongs to the Section Locomotion and Bioinspired Robotics)

54 pages, 11409 KB

Open AccessArticle

FracFusionNet: A Multi-Level Feature Fusion Convolutional Network for Bone Fracture Detection in Radiographic Images

by Sameh Abd El-Ghany, Mahmood A. Mahmood and A. A. Abd El-Aziz

Diagnostics 2025, 15(17), 2212; https://doi.org/10.3390/diagnostics15172212 - 31 Aug 2025

Viewed by 460

Abstract

Background/Objectives: Bones are essential components of the human body, providing structural support, enabling mobility, storing minerals, and protecting internal organs. Bone fractures (BFs) are common injuries that result from excessive physical force and can lead to serious complications, including bleeding, infection, impaired oxygenation, and long-term disability. Early and accurate identification of fractures through radiographic imaging is critical for effective treatment and improved patient outcomes. However, manual evaluation of X-rays is often time-consuming and prone to diagnostic errors due to human limitations. To address this, artificial intelligence (AI), particularly deep learning (DL), has emerged as a powerful tool for enhancing diagnostic precision in medical imaging. Methods: This research introduces a novel convolutional neural network (CNN) model, the Multi-Level Feature Fusion Network (MLFNet), designed to capture and integrate both low-level and high-level image features. The model was evaluated using the Bone Fracture Multi-Region X-ray (BFMRX) dataset. Preprocessing steps included image normalization, resizing, and contrast enhancement to ensure stable convergence, reduce sensitivity to lighting variations in radiographic images, and maintain consistency. Ablation studies were conducted to assess architectural variations, confirming the model’s robustness and generalizability across data distributions. MLFNet’s high accuracy, interpretability, and efficiency make it a promising solution for clinical deployment. Results: MLFNet achieved an impressive accuracy of 99.60% as a standalone model and 98.81% when integrated into hybrid ensemble architectures with five leading pre-trained DL models. Conclusions: The proposed approach supports timely and precise fracture detection, optimizing the diagnostic process and reducing healthcare costs. This approach offers significant potential to aid clinicians in fields such as orthopedics and radiology, contributing to more equitable and effective patient care. Full article

(This article belongs to the Special Issue Machine-Learning-Based Disease Diagnosis and Prediction)

17 pages, 588 KB

Open AccessArticle

Diffusion-Inspired Masked Language Modeling for Symbolic Harmony Generation on a Fixed Time Grid

by Maximos Kaliakatsos-Papakostas, Dimos Makris, Konstantinos Soiledis, Konstantinos-Theodoros Tsamis, Vassilis Katsouros and Emilios Cambouropoulos

Appl. Sci. 2025, 15(17), 9513; https://doi.org/10.3390/app15179513 - 29 Aug 2025

Viewed by 264

Abstract

We present a novel encoder-only Transformer model for symbolic music harmony generation, based on a fixed time-grid representation of melody and harmony. Inspired by denoising diffusion processes, our model progressively unmasks harmony tokens over a sequence of discrete stages, learning to reconstruct the full harmonic structure from partial context. Unlike autoregressive models, this formulation enables flexible, non-sequential generation and supports explicit control over harmony placement. The model is stage-aware, receiving timestep embeddings analogous to diffusion timesteps, and is conditioned on both a binary piano roll and a pitch class roll to capture melodic context. We explore two unmasking schedules—random token revealing and midpoint doubling—both requiring a fixed and significantly reduced number of model calls at inference time. While our approach achieves competitive performance with strong autoregressive baselines (GPT-2 and BART) across several harmonic metrics, its key advantages lie in controllability, structured decoding with fixed inference steps, and alignment with musical structure. Ablation studies further highlight the role of stage awareness and pitch class conditioning. Our results position this method as a viable and interpretable alternative for symbolic harmony generation and a foundation for future work on structured, controllable musical modeling. Full article

(This article belongs to the Special Issue The Age of Transformers: Emerging Trends and Applications)

20 pages, 4789 KB

Open AccessArticle

Towards Gas Plume Identification in Industrial and Livestock Farm Environments Using Infrared Hyperspectral Imaging: A Background Modeling and Suppression Method

by Zhiqiang Ning, Zhengang Li, Rong Qian and Yonghua Fang

Agriculture 2025, 15(17), 1835; https://doi.org/10.3390/agriculture15171835 - 29 Aug 2025

Viewed by 439

Abstract

Hyperspectral imaging for gas plume identification holds significant potential for applications in industrial emission control and environmental monitoring, including critical needs in livestock farm environments. Conventional pixel-by-pixel spectral identification methods primarily rely on spectral information, often overlooking the rich spatial distribution features inherent in hyperspectral images. This oversight can lead to challenges such as inaccurate identification or leakage along the edge regions of gas plumes and false positives from isolated pixels in non-gas areas. While infrared imaging for gas plumes offers a new perspective by leveraging multi-frame image variations to locate plumes, these methods typically lack spectral discriminability. To address these limitations, we draw inspiration from the multi-frame analysis framework of infrared imaging and propose a novel hyperspectral gas plume identification method based on image background modeling and suppression. Our approach begins by employing background modeling and suppression techniques to accurately detect the primary gas plume region. Subsequently, a representative spectrum is extracted from this identified plume region for precise gas identification. To further enhance the identification accuracy, especially in the challenging plume edge regions, a spatial-spectral combined judgment operator is applied as a post-processing step. The effectiveness of the method was validated through experiments using both simulated and real-world measured data from ammonia and methanol gas releases. We compare its performance against classical methods and an ablation version of our model. Results consistently demonstrate that our method effectively detects low-concentration, thin, and diffuse gas plumes, offering a more robust and accurate solution for hyperspectral gas plume identification with strong applicability to real-world industrial and livestock farm scenarios. Full article
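The background modeling and suppression step can be illustrated with a deliberately simple stand-in. The abstract does not describe the authors' background model, so the sketch below assumes a per-pixel temporal mean over earlier frames and a fixed deviation threshold; both choices and all function names are hypothetical, and frames are plain nested lists for a single spectral band.

```python
def background_model(frames):
    """Per-pixel temporal mean over a list of equally sized 2-D frames."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]


def suppress_background(frame, background, threshold):
    """Boolean mask of pixels that deviate from the background by more than threshold."""
    return [[abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]
```

In a pipeline like the one described, the pixels flagged by the mask would be the candidate plume region from which a representative spectrum is then extracted.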

(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

29 pages, 6541 KB

Open AccessArticle

A Novel Spatio-Temporal Graph Convolutional Network with Attention Mechanism for PM2.5 Concentration Prediction

by Xin Guan, Xinyue Mo and Huan Li

Mach. Learn. Knowl. Extr. 2025, 7(3), 88; https://doi.org/10.3390/make7030088 - 27 Aug 2025

Viewed by 515

Abstract

Accurate and high-resolution spatio-temporal prediction of PM2.5 concentrations remains a significant challenge for air pollution early warning and prevention. Advanced artificial intelligence (AI) technologies, however, offer promising solutions to this problem. This study designs a spatio-temporal prediction model built upon a seq2seq architecture. The model employs an improved graph convolutional neural network to capture spatially dependent features, integrates time-series information through a gated recurrent unit, and incorporates an attention mechanism to predict PM2.5 concentrations. Drawing on high-resolution satellite remote sensing data, regional, multi-step, high-resolution prediction of PM2.5 concentration in Beijing is performed. To validate the model’s performance, ablation experiments are conducted, and the model is compared with other advanced prediction models. The experimental results show that our proposed Spatio-Temporal Graph Convolutional Network with Attention Mechanism (STGCA) outperforms the comparison models in multi-step forecasting, achieving root mean squared error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) of 4.21, 3.11 and 11.41% for the first step, respectively. For subsequent steps, the model also shows significant improvements, with RMSE, MAE and MAPE values of 5.08, 3.69 and 13.34% for the second step and 6.54, 4.61 and 16.62% for the third step, respectively. Additionally, STGCA achieves index of agreement (IA) values of 0.98, 0.97 and 0.95 and Theil’s inequality coefficient (TIC) values of 0.06, 0.08 and 0.10, further supporting its advantage over the comparison models. These results demonstrate that the proposed model offers an efficient technical approach for smart air pollution forecasting and warning in the future. Full article
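The five reported metrics can be written down compactly. The sketch below uses the usual textbook definitions; the abstract does not define IA or TIC, so Willmott's index of agreement and the bounded (0 to 1) form of Theil's inequality coefficient are assumed. Function names are illustrative only.

```python
import math


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error, in percent (observations must be non-zero)."""
    return 100.0 * sum(abs((p - o) / o) for o, p in zip(obs, pred)) / len(obs)


def index_of_agreement(obs, pred):
    """Willmott's IA: 1 is perfect agreement (undefined if all values equal the mean)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den


def theil_inequality(obs, pred):
    """Theil's inequality coefficient: 0 is a perfect forecast, 1 the worst."""
    n = len(obs)
    num = math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / n)
    den = math.sqrt(sum(p ** 2 for p in pred) / n) + math.sqrt(sum(o ** 2 for o in obs) / n)
    return num / den
```

RMSE, MAE, MAPE and TIC are error measures (lower is better), while IA is an agreement measure (higher is better), which matches the direction of the values quoted for each forecast step.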

20 pages, 1818 KB

Open AccessArticle

Image Captioning Model Based on Multi-Step Cross-Attention Cross-Modal Alignment and External Commonsense Knowledge Augmentation

by Liang Wang, Meiqing Jiao, Zhihai Li, Mengxue Zhang, Haiyan Wei, Yuru Ma, Honghui An, Jiaqi Lin and Jun Wang

Electronics 2025, 14(16), 3325; https://doi.org/10.3390/electronics14163325 - 21 Aug 2025

Viewed by 646

Abstract

To address the semantic mismatch between limited textual descriptions in image captioning training datasets and the multi-semantic nature of images, as well as the underuse of external commonsense knowledge, this article proposes a novel image captioning model based on multi-step cross-attention cross-modal alignment and external commonsense knowledge enhancement. The model employs a backbone architecture comprising CLIP’s ViT visual encoder, Faster R-CNN, a BERT text encoder, and a GPT-2 text decoder. It incorporates two core mechanisms. The first is a multi-step cross-attention mechanism that iteratively aligns image and text features across multiple rounds, progressively enhancing inter-modal semantic consistency for more accurate cross-modal representation fusion. The second is external commonsense knowledge augmentation: the model employs Faster R-CNN to extract region-based object features, which are mapped to corresponding entities within the dataset through entity probability calculation and entity linking. External commonsense knowledge associated with these entities is then retrieved from the ConceptNet knowledge graph, followed by knowledge embedding via TransE and multi-hop reasoning. Finally, the fused multimodal features are fed into the GPT-2 decoder to steer caption generation, enhancing the lexical richness, factual accuracy, and cognitive plausibility of the generated descriptions. In the experiments, the model achieves CIDEr scores of 142.6 on MSCOCO and 78.4 on Flickr30k. Ablations confirm that both modules enhance caption quality. Full article
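For readers unfamiliar with TransE, the knowledge-embedding step mentioned above scores a (head, relation, tail) triple by how closely head + relation lands on tail. The sketch below shows that scoring rule with the L2 norm; the two-dimensional vectors in the usage note are toy values invented for illustration, not embeddings from the paper.

```python
import math


def transe_score(head, relation, tail):
    """Negative L2 distance ||h + r - t||; values closer to zero mean a more plausible triple."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_tails(head, relation, candidates):
    """Sort candidate tail names from most to least plausible for (head, relation, ?)."""
    return sorted(candidates,
                  key=lambda name: transe_score(head, relation, candidates[name]),
                  reverse=True)
```

For example, with `head = [1.0, 0.0]` and `relation = [0.0, 1.0]`, a candidate tail at `[1.0, 1.0]` ranks ahead of one at `[4.0, 5.0]`, because the translated head coincides with it.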

23 pages, 1562 KB

Open AccessArticle

SCNOC-Agentic: A Network Operation and Control Agentic for Satellite Communication Systems

by Wenyu Sun, Chenhua Sun, Yasheng Zhang, Zhan Yin and Zhifeng Kang

Electronics 2025, 14(16), 3320; https://doi.org/10.3390/electronics14163320 - 20 Aug 2025

Viewed by 538

Abstract

Large language models (LLMs) have demonstrated powerful capability to solve practical problems through complex step-by-step reasoning. Specifically designed LLMs have begun to be integrated into terrestrial communication networks. However, relevant research in the field of satellite communications remains exceedingly rare. To address this gap, we introduce SCNOC-Agentic, a novel architecture especially designed to integrate the management and control of satellite communication systems in LLMs. SCNOC-Agentic incorporates four components tailored to the characteristics of satellite communications: intent refinement, multi-agent workflow, personalized long-term memory, and graph-based retrieval. Furthermore, we define four typical real-world scenarios that can be effectively addressed by integrating with LLMs: network task planning, carrier and cell optimization, fault analysis of satellites, and satellite management and control. Utilizing the SCNOC-Agentic framework, a series of open-source LLMs have achieved outstanding performance on the four tasks under various baselines, including zero-shot CoT, CoT-5, and self-consistency. For example, qwen2.5-70B with SCNOC-Agentic significantly improves the parameter generation accuracy in the network task planning task from 15.6% to 32.2%, while llama-3.3-70B increases from 16.2% to 29.0%. In addition, ablation studies were conducted to validate the importance of each proposed component within the SCNOC-Agentic framework. Full article
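Of the baselines named above, self-consistency is the easiest to show in code: sample several reasoning paths and keep the most frequent final answer. The sketch below covers only that voting step and is not part of SCNOC-Agentic itself; the answers are assumed to have been extracted from the sampled LLM outputs already, and the function name is hypothetical.

```python
from collections import Counter


def self_consistency(answers):
    """Majority vote over final answers; returns (answer, share of votes)."""
    counts = Counter(answers)
    # most_common breaks ties by first-seen order.
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)
```

The vote share gives a rough sense of how strongly the sampled paths agree, which can be useful when deciding whether to trust a generated parameter set.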

(This article belongs to the Special Issue Satellite Terrestrial Networks: Technologies, Security and Applications)

15 pages, 4099 KB

Open AccessArticle

Pulsed Laser Annealing of Deposited Amorphous Carbon Films

by Arianna D. Rivera, Eitan Hershkovitz, Panagiotis Panoutsopoulos, Manny X. de Jesus Lopez, Bradley Simpson, Honggyu Kim, Rajaram Narayanan, Jesse Johnson and Kevin S. Jones

C 2025, 11(3), 60; https://doi.org/10.3390/c11030060 - 8 Aug 2025

Viewed by 629

Abstract

Pulsed laser annealing (PLA) was performed on a 0.3 μm thick hydrogenated amorphous carbon (a-C:H) film deposited on silicon substrate by plasma-enhanced chemical vapor deposition (PECVD). The 532 nm, 32 ns PLA ranged in fluence from 0.2 to 0.94 J cm⁻². There were no visible signs of film delamination over the entire fluence range for a single pulse. As the fluence increased, graphitization of the amorphous film bulk was observed. However, at the near surface of the film, there was a concomitant increase in sp³ content. The sp³ bonding observed is the result of the formation of a thin diamond-like layer on the surface of the carbon film. Along with increasing laser fluence, the film swelled by 75% up to 0.6 J cm⁻². In addition, carbon fiber formation was observed at 0.6 J cm⁻², increasing in size and depth up through 0.94 J cm⁻². The origin of this transformation may be associated with a rapid outgassing of hydrogen from the amorphous carbon during the PLA step. Additionally, there was a dramatic increase in the visible light absorption of these thin films with increasing laser fluence, despite the films being less than a micron thick. These results suggest that PLA of a-C:H film is a useful method for modifying the surface structure for optical or electrochemical applications without film ablation. Full article

(This article belongs to the Special Issue Carbon Functionalization: From Synthesis to Applications)

16 pages, 3373 KB

Open AccessArticle

Knowledge-Augmented Zero-Shot Method for Power Equipment Defect Grading with Chain-of-Thought LLMs

by Jianguang Du, Bo Li, Zhenyu Chen, Lian Shen, Pufan Liu and Zhongyang Ran

Electronics 2025, 14(15), 3101; https://doi.org/10.3390/electronics14153101 - 4 Aug 2025

Viewed by 419

Abstract

As large language models (LLMs) increasingly enter specialized domains, inference without external resources often leads to knowledge gaps, opaque reasoning, and hallucinations. To address these challenges in power equipment defect grading, we propose a zero-shot question-answering framework that requires no task-specific examples. Our system performs two-stage retrieval—first using a Sentence-BERT model fine-tuned on power equipment maintenance texts for coarse filtering, then combining TF-IDF and semantic re-ranking for fine-grained selection of the most relevant knowledge snippets. We embed both the user query and the retrieved evidence into a Chain-of-Thought (CoT) prompt, guiding the pre-trained LLM through multi-step reasoning with self-validation and without any model fine-tuning. Experimental results show that on a held-out test set of 218 inspection records, our method achieves a grading accuracy of 54.2%, which is 6.0 percentage points higher than the fine-tuned BERT baseline at 48.2%; an Explanation Coherence Score (ECS) of 4.2 compared to 3.1 for the baseline; a mean retrieval latency of 28.3 ms; and an average LLM inference time of 5.46 s. Ablation and sensitivity analyses demonstrate that a fine-stage retrieval pool size of k = 30 offers the optimal trade-off between accuracy and latency; human expert evaluation by six senior engineers yields average Usefulness and Trustworthiness scores of 4.1 and 4.3, respectively. Case studies across representative defect scenarios further highlight the system’s robust zero-shot performance. Full article
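The TF-IDF part of the fine-grained retrieval stage can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: whitespace tokenization, a smoothed IDF variant, and cosine similarity are choices made here, not details taken from the paper, and the Sentence-BERT coarse filter and semantic re-ranking are omitted. The snippet texts in the usage note are invented.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for each document, plus the IDF table."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF so that terms present in every document keep a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vecs = [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]
    return vecs, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank(query, docs, k):
    """Indices of the k documents most similar to the query."""
    vecs, idf = tfidf_vectors(docs)
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query.lower().split()).items()}
    order = sorted(range(len(docs)), key=lambda i: cosine(q, vecs[i]), reverse=True)
    return order[:k]
```

Calling `rank("oil leak", ["transformer oil leak at flange", "insulator surface crack", "breaker contact overheating"], 1)` selects the first snippet; in the described system, `k` would play the role of the retrieval pool size examined in the sensitivity analysis.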

(This article belongs to the Special Issue Recent Progress in Visual AI: Architectures, Learning, and Applications)

25 pages, 1183 KB

Open AccessArticle

A Novel Data-Driven Multi-Branch LSTM Architecture with Attention Mechanisms for Forecasting Electric Vehicle Adoption

by Md Mizanur Rahaman, Md Rashedul Islam, Mia Md Tofayel Gonee Manik, Md Munna Aziz, Inshad Rahman Noman, Mohammad Muzahidur Rahman Bhuiyan, Kanchon Kumar Bishnu and Joy Chakra Bortty

World Electr. Veh. J. 2025, 16(8), 432; https://doi.org/10.3390/wevj16080432 - 1 Aug 2025

Viewed by 455

Abstract

Accurately predicting how quickly people will adopt electric vehicles (EVs) is vital for planning charging stations, managing supply chains, and shaping climate policy. We present a forecasting model that uses three separate Long Short-Term Memory (LSTM) branches—one for past EV sales, one for infrastructure and policy signals, and one for economic trends. An attention mechanism first highlights the most important weeks in each branch, then decides which branch matters most at any point in time. Trained end-to-end on publicly available data, the model beats traditional statistical methods and newer deep learning baselines while remaining small enough to run efficiently. An ablation study shows that every branch and both attention steps improve accuracy, and that adding policy and economic data helps more than relying on EV history alone. Because the network is modular and its attention weights are easy to interpret, it can be extended to produce confidence intervals, include physical constraints, or forecast adoption of other clean-energy technologies. Full article
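The two attention steps described above (first over weeks within each branch, then over branches) can be sketched as two rounds of softmax-weighted pooling. In the actual model the scores would come from learned layers; here they are passed in as plain numbers, and every name is a hypothetical stand-in.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(vectors, scores):
    """Weighted sum of equally sized vectors, using softmax(scores) as weights."""
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def two_level_attention(branches, time_scores, branch_scores):
    """Pool each branch over time, then pool the branch summaries."""
    summaries = [attend(steps, scores)[0] for steps, scores in zip(branches, time_scores)]
    return attend(summaries, branch_scores)
```

The weights returned at each level are what make such a design inspectable: the inner weights indicate which weeks mattered within a branch, and the outer weights indicate which branch dominated.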

11 pages, 261 KB

Open AccessReview

Minimally Invasive Surgical Strategies for the Treatment of Atrial Fibrillation: An Evolving Role in Contemporary Cardiac Surgery

by Luciana Benvegnù, Giorgia Cibin, Fabiola Perrone, Vincenzo Tarzia, Augusto D’Onofrio, Giovanni Battista Luciani, Gino Gerosa and Francesco Onorati

J. Cardiovasc. Dev. Dis. 2025, 12(8), 289; https://doi.org/10.3390/jcdd12080289 - 29 Jul 2025

Viewed by 576

Abstract

Atrial fibrillation remains the most frequent sustained arrhythmia, particularly in the elderly population, and is associated with increased risks of stroke, heart failure, and reduced quality of life. While catheter ablation is widely used for rhythm control, its efficacy is limited in persistent and long-standing atrial fibrillation. Over the past two decades, minimally invasive surgical strategies have emerged as effective alternatives, aiming to replicate the success of the Cox-Maze procedure while reducing surgical trauma. This overview critically summarizes the current minimally invasive techniques available for atrial fibrillation treatment, including mini-thoracotomy ablation, thoracoscopic ablation, and hybrid procedures such as the convergent approach. These methods offer the potential for durable sinus rhythm restoration by enabling direct visualization, transmural lesion creation, and left atrial appendage exclusion, with lower perioperative morbidity compared to traditional open surgery. The choice of energy source plays a key role in lesion efficacy and safety. Particular attention is given to the technical steps of each procedure, patient selection criteria, and the role of left atrial appendage closure in stroke prevention. Hybrid strategies, which combine epicardial surgical ablation with endocardial catheter-based procedures, have shown encouraging outcomes in patients with refractory or long-standing atrial fibrillation. Despite the steep learning curve, minimally invasive techniques provide significant benefits in terms of recovery time, reduced hospital stay, and fewer complications. As evidence continues to evolve, these approaches represent a key advancement in the surgical management of atrial fibrillation, deserving integration into contemporary treatment algorithms and multidisciplinary heart team planning. Full article

(This article belongs to the Special Issue Hybrid Ablation of the Atrial Fibrillation)

21 pages, 2904 KB

Open AccessArticle

A Lightweight Greenhouse Tomato Fruit Identification Method Based on Improved YOLOv11n

by Xingyu Gao, Fengyu Li, Jun Yan, Qinyou Sun, Xianyong Meng and Pingzeng Liu

Agriculture 2025, 15(14), 1497; https://doi.org/10.3390/agriculture15141497 - 11 Jul 2025

Viewed by 509

Abstract

The aim of this paper is to propose an improved lightweight YOLOv11 detection method in response to the difficulty of extracting tomato fruit features in greenhouse environments and the need for lightweight picking equipment. Firstly, the conventional strided convolution is replaced by the Average pooling Downsampling (ADown) module with multi-path fusion, and Gated Convolution (gConv) is incorporated in the C3K2 module, which considerably reduces the number of parameters and the computation of the model. Concurrently, Lightweight Shared Convolutional Detection (LSCD) is incorporated into the detection head with the aim of further reducing the computational complexity. Finally, the Wise-Powerful Intersection over Union (Wise-PIoU) loss function is employed to optimise the model accuracy, and the effectiveness of each improvement module is verified by means of ablation experiments. The experimental results demonstrate that the precision of ACLW-YOLO (A stands for ADown, C stands for C3K2_gConv, L stands for LSCD, and W stands for Wise-PIoU) reaches 94.2%, the recall (R) is 92.0%, and the mean average precision (mAP) is 95.2%. Meanwhile, the model size is only 3.3 MB, the number of parameters is 1.6 M, and the floating-point computation is 3.9 GFLOPs. The ACLW-YOLO model improves detection precision while substantially reducing computational complexity and memory utilisation. The study demonstrates that the enhanced model exhibits superior recognition performance for various tomato fruits, thereby providing a robust theoretical and technical foundation for the automation of greenhouse tomato picking processes. Full article
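The downsampling replacement mentioned above rests on average pooling, which has no learnable weights. The abstract does not detail the internals of ADown, so the sketch below shows only the underlying idea, plain 2×2 average pooling with stride 2 on a single-channel feature map held in nested lists; the function name is hypothetical.

```python
def avg_pool_2x2(feature_map):
    """Halve height and width by averaging each non-overlapping 2x2 window.

    A trailing odd row or column is dropped.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            (feature_map[r][c] + feature_map[r][c + 1]
             + feature_map[r + 1][c] + feature_map[r + 1][c + 1]) / 4.0
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]
```

Because the pooling itself adds no parameters, any saving relative to a strided convolution comes from how much learned convolution remains around it, which is the kind of trade-off the parameter and GFLOPs figures above reflect.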

(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

14 pages, 3449 KB

Open AccessArticle

Superhydrophobic Coating on 6061 Aluminum Alloy Fabricated by Femtosecond Laser Etching and Anodic Oxidation

by Quanlv Liu and Yuxin Wang

Coatings 2025, 15(7), 816; https://doi.org/10.3390/coatings15070816 - 11 Jul 2025

Viewed by 662

Abstract

A superhydrophobic surface with hierarchical micro/nano-array structures was successfully fabricated on 6061 aluminum alloy through a combination of femtosecond laser etching and anodic oxidation. Femtosecond laser etching formed a regularly arranged microscale “pit-protrusion” array on the aluminum alloy surface. After modification with a fluorosilane ethanol solution, the surface exhibited superhydrophobicity with a contact angle of 154°. Subsequently, the anodic oxidation process formed an anodic oxide film dominated by an array of aluminum oxide (Al₂O₃) nanopores at the submicron scale. Scanning electron microscopy (SEM) and X-ray diffraction (XRD) analyses revealed that the nanopore structures uniformly and continuously covered the laser-ablated layer. This hierarchical structure significantly increased the surface water contact angle to 162°. Wettability analysis showed that the prepared composite coating formed an air layer accounting for 91% of the surface area. Compared with the sample only treated by femtosecond laser etching, the presence of the Al₂O₃ nanopore structure significantly enhanced the mechanical durability, superhydrophobic durability, and corrosion resistance of the superhydrophobic surface. The proposed multi-step fabrication strategy offers an innovative method for creating multifunctional, durable superhydrophobic coatings and has important implications for their large-scale industrial use. Full article
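An air-layer fraction such as the 91% quoted above is commonly estimated from contact angles with the Cassie-Baxter relation, cos θ* = f_s (cos θ_Y + 1) − 1, where f_s is the wetted solid fraction and θ_Y the contact angle on a flat surface of the same chemistry. The abstract does not state how the figure was obtained, so applying this relation is an assumption, and the flat-surface angle used in the usage note is a hypothetical value.

```python
import math


def cassie_baxter_angle(young_angle_deg, solid_fraction):
    """Apparent contact angle (degrees) for a given flat-surface angle and solid fraction."""
    cos_apparent = solid_fraction * (math.cos(math.radians(young_angle_deg)) + 1.0) - 1.0
    return math.degrees(math.acos(cos_apparent))


def air_fraction(apparent_angle_deg, young_angle_deg):
    """Fraction of the droplet base resting on trapped air, 1 - f_s."""
    solid_fraction = ((math.cos(math.radians(apparent_angle_deg)) + 1.0)
                      / (math.cos(math.radians(young_angle_deg)) + 1.0))
    return 1.0 - solid_fraction
```

Calling `air_fraction(162.0, young_angle)` with a measured flat-surface angle for the fluorosilane-modified oxide would give the corresponding estimate; a smaller solid fraction always pushes the apparent angle toward 180°.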

(This article belongs to the Special Issue Superhydrophobic Coatings, 2nd Edition)

25 pages, 4334 KB

Open AccessArticle

Multi-Task Learning-Based Traffic Flow Prediction Through Highway Toll Stations During Holidays

by Xiaowei Liu, Yunfan Zhang, Zhongyi Han, Hao Qiu, Shuxin Zhang and Jinlei Zhang

Technologies 2025, 13(7), 287; https://doi.org/10.3390/technologies13070287 - 4 Jul 2025

Viewed by 478

Abstract

Accurate traffic flow prediction is essential for highway operations, especially during holidays when surging traffic poses significant challenges. This study focuses on holiday traffic and introduces a spatiotemporal cross-attention network (ST-Cross-Attn) that combines a bidirectional convolutional LSTM (Bi-ConvLSTM) with a cross-attention module to jointly predict toll station inbound flow and outbound flow. Under the multi-task learning framework, the model shares spatial–temporal features between inbound flow and outbound flow, enhancing their representations and improving multi-step prediction accuracy. Using three years of highway traffic flow data during Labor Day from Shandong, China, ST-Cross-Attn outperformed eight state-of-the-art benchmarks, achieving an average improvement of 4.34% in inbound flow prediction and 2.3% in outbound flow prediction. Extensive ablation studies further confirmed the effectiveness of the model’s components and multi-task learning framework, demonstrating its potential for reliable holiday traffic forecasting. Full article

Search Results (238)
