Search Results (1,878)

Search Parameters:
Keywords = edge information modeling

24 pages, 21436 KB  
Article
ESG-YOLO: An Efficient Object Detection Algorithm for Transplant Quality Assessment of Field-Grown Tomato Seedlings Based on YOLOv8n
by Xinhui Wu, Zhenfa Dong, Can Wang, Ziyang Zhu, Yanxi Guo and Shuhe Zheng
Agronomy 2025, 15(9), 2088; https://doi.org/10.3390/agronomy15092088 - 29 Aug 2025
Abstract
Intelligent detection of tomato seedling transplant quality represents a core technology for advancing agricultural automation. However, in practical applications, existing algorithms still face numerous technical challenges, particularly with prominent issues of false detections and missed detections during recognition. To address these challenges, we developed the ESG-YOLO object detection model and successfully deployed it on edge devices, enabling real-time assessment of tomato seedling transplanting quality. Our methodology integrates three key innovations: First, an EMA (Efficient Multi-scale Attention) module is embedded within the YOLOv8 neck network to suppress interference from redundant information and enhance morphological focus on seedlings. Second, the feature fusion network is reconstructed using a GSConv-based Slim-neck architecture, achieving a lightweight neck structure compatible with edge deployment. Finally, optimization employs the GIoU (Generalized Intersection over Union) loss function to precisely localize seedling position and morphology, thereby reducing false detection and missed detection. The experimental results demonstrate that our ESG-YOLO model achieves a mean average precision (mAP) of 97.4%, surpassing lightweight models including YOLOv3-tiny, YOLOv5n, YOLOv7-tiny, and YOLOv8n in precision, with improvements of 9.3, 7.2, 5.7, and 2.2%, respectively. Notably, for detecting key yield-impacting categories such as “exposed seedlings” and “missed hills”, the average precision (AP) values reach 98.8 and 94.0%, respectively. To validate the model’s effectiveness on edge devices, the ESG-YOLO model was deployed on an NVIDIA Jetson TX2 NX platform, achieving a frame rate of 18.0 FPS for efficient detection of tomato seedling transplanting quality. This model provides technical support for transplanting performance assessment, enabling quality control and enhanced vegetable yield, thus actively contributing to smart agriculture initiatives.
(This article belongs to the Section Precision and Digital Agriculture)
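The GIoU loss named in the abstract extends IoU with a penalty based on the smallest box enclosing both the prediction and the ground truth, which gives a useful gradient even when the boxes do not overlap. A minimal sketch of the standard formulation (function name and corner-coordinate box format are illustrative, not taken from the paper):

```python
def giou_loss(box_a, box_b):
    """Generalized IoU loss for axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    # Intersection rectangle (empty if the boxes are disjoint)
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = area_a + area_b - inter
    iou = inter / union
    # Smallest box C enclosing both A and B
    cx1, cy1 = min(ax1, bx1), min(ay1, by1)
    cx2, cy2 = max(ax2, bx2), max(ay2, by2)
    area_c = (cx2 - cx1) * (cy2 - cy1)
    giou = iou - (area_c - union) / area_c
    return 1.0 - giou  # loss in [0, 2]
```

For identical boxes the loss is 0; for disjoint boxes it grows toward 2 as the enclosing box expands, which is what lets the loss steer non-overlapping predictions toward the target.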
Show Figures

Figure 1

26 pages, 7561 KB  
Article
Satellite Optical Target Edge Detection Based on Knowledge Distillation
by Ying Meng, Luping Zhang, Yan Zhang, Moufa Hu, Fei Zhao and Xinglin Shen
Remote Sens. 2025, 17(17), 3008; https://doi.org/10.3390/rs17173008 - 29 Aug 2025
Abstract
Edge detection of space targets is vital in aerospace applications, such as satellite monitoring and analysis, yet it faces challenges due to diverse target shapes and complex backgrounds. While deep learning-based edge detection methods dominate due to their powerful feature representation capabilities, they often suffer from large parameter sizes and lack explicit geometric prior constraints for space targets. This paper proposes a novel edge detection method for satellite targets based on knowledge distillation, namely STED-KD. Firstly, a multi-stage distillation strategy is proposed to guide a lightweight, fully convolutional network with fewer parameters to learn key features and decision boundaries from a complex teacher model, achieving model efficiency. Next, a shape prior guidance module is integrated into the student branch, incorporating geometric shape information through shape prior model construction, similarity metric calculation, and feature reconstruction, enhancing adaptability to space targets and improving detection accuracy. Additionally, a curvature-guided edge loss function is designed to ensure continuous and complete edges, minimizing local discontinuities. Experimental results on the UESD space target dataset demonstrate superior performance, with ODS, OIS, and AP scores of 0.659, 0.715, and 0.596, respectively. On the BSDS500, STED-KD achieves ODS, OIS, and AP scores of 0.818, 0.829, and 0.850, respectively, demonstrating strong competitiveness and further confirming its stability. Full article
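The teacher-student transfer at the heart of knowledge distillation is usually a temperature-softened KL divergence between the two models' output distributions. The paper's multi-stage strategy is more elaborate; the sketch below shows only the generic single-stage soft-target loss (function names and the default temperature are assumptions, not the authors' formulation):

```python
import math

def softmax(logits, t):
    """Temperature-softened softmax (shift by max for numerical stability)."""
    m = max(logits)
    exps = [math.exp((z - m) / t) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, t=4.0):
    """KL divergence between softened teacher targets and student
    predictions, scaled by t^2 (Hinton-style soft-target loss)."""
    p = softmax(teacher_logits, t)  # teacher soft targets
    q = softmax(student_logits, t)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return t * t * kl
```

Matching distributions give zero loss; the t^2 factor keeps gradient magnitudes comparable across temperatures.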

15 pages, 1690 KB  
Article
OTB-YOLO: An Enhanced Lightweight YOLO Architecture for UAV-Based Maize Tassel Detection
by Yu Han, Xingya Wang, Luyan Niu, Song Shi, Yingbo Gao, Kuijie Gong, Xia Zhang and Jiye Zheng
Plants 2025, 14(17), 2701; https://doi.org/10.3390/plants14172701 - 29 Aug 2025
Abstract
To tackle the challenges posed by substantial variations in target scale, intricate background interference, and the likelihood of missing small targets in multi-temporal UAV maize tassel imagery, an optimized lightweight detection model derived from YOLOv11 is introduced, named OTB-YOLO. Here, “OTB” is an acronym derived from the initials of the model’s core improved modules: Omni-dimensional dynamic convolution (ODConv), Triplet Attention, and Bi-directional Feature Pyramid Network (BiFPN). This model integrates the PaddlePaddle open-source maize tassel recognition benchmark dataset with the public Multi-Temporal Drone Corn Dataset (MTDC). Traditional convolutional layers are substituted with omni-dimensional dynamic convolution (ODConv) to mitigate computational redundancy. A triplet attention module is incorporated to refine feature extraction within the backbone network, while a bidirectional feature pyramid network (BiFPN) is engineered to enhance accuracy via multi-level feature pyramids and bidirectional information flow. Empirical analysis demonstrates that the enhanced model achieves a precision of 95.6%, recall of 92.1%, and mAP@0.5 of 96.6%, marking improvements of 3.2%, 2.5%, and 3.1%, respectively, over the baseline model. Concurrently, the model’s computational complexity is reduced to 6.0 GFLOPs, rendering it appropriate for deployment on UAV edge computing platforms. Full article
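The BiFPN component combines feature maps with learnable non-negative weights using "fast normalized fusion" rather than a softmax. A minimal sketch of that fusion step over already-resized feature vectors (names and the list-based representation are illustrative):

```python
def bifpn_fuse(features, weights, eps=1e-4):
    """BiFPN fast normalized fusion:
    out = sum(w_i * f_i) / (sum(w_i) + eps), with each w_i clamped
    to be non-negative (ReLU), so the result is a convex-like blend."""
    w = [max(0.0, wi) for wi in weights]
    total = sum(w) + eps
    n = len(features[0])
    return [sum(w[i] * features[i][k] for i in range(len(features))) / total
            for k in range(n)]
```

With equal weights the fusion reduces to a plain average; a negative (clamped) weight simply drops that input, which is cheaper than the softmax normalization it replaces.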

20 pages, 4789 KB  
Article
Towards Gas Plume Identification in Industrial and Livestock Farm Environments Using Infrared Hyperspectral Imaging: A Background Modeling and Suppression Method
by Zhiqiang Ning, Zhengang Li, Rong Qian and Yonghua Fang
Agriculture 2025, 15(17), 1835; https://doi.org/10.3390/agriculture15171835 - 29 Aug 2025
Abstract
Hyperspectral imaging for gas plume identification holds significant potential for applications in industrial emission control and environmental monitoring, including critical needs in livestock farm environments. Conventional pixel-by-pixel spectral identification methods primarily rely on spectral information, often overlooking the rich spatial distribution features inherent in hyperspectral images. This oversight can lead to challenges such as inaccurate identification or leakage along the edge regions of gas plumes and false positives from isolated pixels in non-gas areas. While infrared imaging for gas plumes offers a new perspective by leveraging multi-frame image variations to locate plumes, these methods typically lack spectral discriminability. To address these limitations, we draw inspiration from the multi-frame analysis framework of infrared imaging and propose a novel hyperspectral gas plume identification method based on image background modeling and suppression. Our approach begins by employing background modeling and suppression techniques to accurately detect the primary gas plume region. Subsequently, a representative spectrum is extracted from this identified plume region for precise gas identification. To further enhance the identification accuracy, especially in the challenging plume edge regions, a spatial-spectral combined judgment operator is applied as a post-processing step. The effectiveness of the method was validated through experiments using both simulated and real-world measured data from ammonia and methanol gas releases. We compare its performance against classical methods and an ablation version of our model. Results consistently demonstrate that our method effectively detects low-concentration, thin, and diffuse gas plumes, offering a more robust and accurate solution for hyperspectral gas plume identification with strong applicability to real-world industrial and livestock farm scenarios. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

20 pages, 1732 KB  
Article
Machine Learning Applied to Crop Mapping in Rice Varieties Using Spectral Images
by Rubén Simeón, Kenza El Masslouhi, Alba Agenjos-Moreno, Beatriz Ricarte, Antonio Uris, Belen Franch, Constanza Rubio and Alberto San Bautista
Agriculture 2025, 15(17), 1832; https://doi.org/10.3390/agriculture15171832 - 28 Aug 2025
Abstract
Global food security is increasingly challenged by climate change and the availability of arable land. This situation calls for improved crop monitoring and management strategies. Rice is a staple food for nearly half of the world’s population and a significant source of calories. Accurately identifying rice varieties is crucial for maintaining varietal purity, planning agricultural activities, and enhancing genetic improvement strategies. This study evaluates the effectiveness of machine learning algorithms to identify the most effective approach to predicting rice varieties, using multitemporal Sentinel-2 images in the Marismas del Guadalquivir of Sevilla, Spain. Spectral reflectance data were collected from ten Sentinel-2 bands, which include visible, red-edge, near-infrared, and shortwave infrared regions, at two key phenological stages: tillering and reproduction. The models were trained on pixel-level data from the growing seasons of 2021 and 2024, and they were evaluated using a test set from 2022. Four classifiers were compared: random forest, XGBoost, K-nearest neighbors, and logistic regression. Performance was assessed based on accuracy, precision, recall, specificity and F1 score. Non-linear models outperformed linear ones. The highest performance was achieved with the Random Forest classifier during the reproduction phase, reaching an exceptional accuracy of 0.94 using all bands or only the most informative subset (red edge, NIR, and SWIR). This classifier also maintained excellent accuracy (0.93 and 0.92) during the initial tillering phase. This fact demonstrates that it is possible to perform reliable varietal mapping in the early stages of the growing season. Full article
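The five evaluation metrics listed here all fall out of a binary confusion matrix; a small sketch of how they relate (written for the binary case, whereas the study's varietal mapping task is multi-class and would apply this one-vs-rest):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, specificity, and F1 from 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0          # a.k.a. sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, specificity, f1
```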

20 pages, 1685 KB  
Article
Small Language Model-Guided Quantile Temporal Difference Learning for Improved IoT Application Placement in Fog Computing
by Bhargavi Krishnamurthy and Sajjan G. Shiva
Mathematics 2025, 13(17), 2768; https://doi.org/10.3390/math13172768 - 28 Aug 2025
Abstract
The global market for fog computing is expected to reach USD 6385 million by 2032. Modern enterprises rely on fog computing since it offers computational resources at edge devices through decentralized computation mechanisms. One of the crucial components of fog computing is the proper placement of applications on fog nodes (edge devices, Internet of Things (IoT)) for servicing. Large-scale, geographically distributed fog networks and heterogeneity of fog nodes make application placement a challenging task. Quantile Temporal Difference Learning (QTDL) is a promising distributed form of a reinforcement learning algorithm. It is superior to traditional reinforcement learning as it learns the act of prediction based on the full distribution of returns. QTDL is enriched by a small language model (SLM), which results in low inference latency, reduced costs of operation, and also enhanced rates of learning. The SLM, being a lightweight model, has policy-shaping capability, which makes it an ideal choice for the resource-constrained environment of edge devices. The data-driven quantiles of temporal difference learning are blended with the informed heuristics of the SLM to prevent quantile loss and over- or underestimation of the policies. In this paper, a novel SLM-guided QTDL framework is proposed to perform task scheduling among fog nodes. The proposed framework is implemented using the iFogSim simulator by considering both certain and uncertain fog computing environments. Further, the results obtained are validated using expected value analysis. The performance of the proposed framework is found to be satisfactory with respect to the following performance metrics: energy consumption, makespan time violations, budget violations, and load imbalance ratio.
(This article belongs to the Special Issue Advanced Reinforcement Learning in Internet of Things Networks)
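For context, the core quantile temporal-difference update keeps one estimate per quantile fraction and nudges each toward the Bellman target, so the full return distribution is learned rather than only its mean. The sketch below is the generic quantile-regression TD(0) step, not the SLM-guided framework proposed in the paper (all names and constants are illustrative):

```python
def quantile_td_update(theta, reward, next_sample, alpha=0.1, gamma=0.9):
    """One quantile-regression TD(0) step: the estimate theta[i] at quantile
    fraction tau_i moves up by alpha*tau_i if the Bellman target
    r + gamma*z' lies above it, and down by alpha*(1 - tau_i) otherwise."""
    n = len(theta)
    taus = [(i + 0.5) / n for i in range(n)]  # midpoint quantile fractions
    target = reward + gamma * next_sample
    return [th + alpha * (tau - (1.0 if target < th else 0.0))
            for th, tau in zip(theta, taus)]
```

Because higher quantiles step up more aggressively and down more cautiously, repeated updates spread the estimates across the return distribution instead of collapsing them to its mean.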

18 pages, 2596 KB  
Article
Integrating RGB Image Processing and Random Forest Algorithm to Estimate Stripe Rust Disease Severity in Wheat
by Andrzej Wójtowicz, Jan Piekarczyk, Marek Wójtowicz, Sławomir Królewicz, Ilona Świerczyńska, Katarzyna Pieczul, Jarosław Jasiewicz and Jakub Ceglarek
Remote Sens. 2025, 17(17), 2981; https://doi.org/10.3390/rs17172981 - 27 Aug 2025
Abstract
Accurate and timely assessment of crop disease severity is crucial for effective management strategies and ensuring sustainable agricultural production. Traditional visual disease scoring methods are subjective and labor-intensive, highlighting the need for automated, objective alternatives. This study evaluates the effectiveness of a model for field-based identification and quantification of stripe rust severity in wheat using red, green, blue RGB imaging. Based on crop reflectance hyperspectra (CRHS) acquired using a FieldSpec ASD spectroradiometer, two complementary approaches were developed. In the first approach, we estimate single leaf disease severity (LDS) under laboratory conditions, while in the second approach, we assess crop disease severity (CDS) from field-based RGB images. The high accuracy of both methods enabled the development of a predictive model for estimating LDS from CDS, offering a scalable solution for precision disease monitoring in wheat cultivation. The experiment was conducted on four winter wheat plots subjected to varying fungicide treatments to induce different levels of stripe rust severity for model calibration, with treatment regimes ranging from no application to three applications during the growing season. RGB images were acquired in both laboratory conditions (individual leaves) and field conditions (nadir and oblique perspectives), complemented by hyperspectral measurements in the 350–2500 nm range. To achieve automated and objective assessment of disease severity, we developed custom image-processing scripts and applied Random Forest classification and regression models. The models demonstrated high predictive performance, with the combined use of nadir and oblique RGB imagery achieving the highest classification accuracy (97.87%), sensitivity (100%), and specificity (95.83%). Oblique images were more sensitive to early-stage infection, while nadir images offered greater specificity. 
Spectral feature selection revealed that wavelengths in the visible (e.g., 508–563 nm and 621–703 nm) and red-edge/SWIR regions (around 1556–1767 nm) were particularly informative for disease detection. In classification models, shorter wavelengths from the visible range proved to be more useful, while in regression models, longer wavelengths were more effective. The integration of RGB-based image analysis with the Random Forest algorithm provides a robust, scalable, and cost-effective solution for monitoring stripe rust severity under field conditions. This approach holds significant potential for enhancing precision agriculture strategies by enabling early intervention and optimized fungicide application. Full article
(This article belongs to the Section Remote Sensing in Agriculture and Vegetation)

20 pages, 3459 KB  
Article
Diagnosis of Potassium Content in Rubber Leaves Based on Spatial–Spectral Feature Fusion at the Leaf Scale
by Xiaochuan Luo, Rongnian Tang, Chuang Li and Cheng Qian
Remote Sens. 2025, 17(17), 2977; https://doi.org/10.3390/rs17172977 - 27 Aug 2025
Abstract
Hyperspectral imaging (HSI) technology has attracted extensive attention in the field of nutrient diagnosis for rubber leaves. However, the mainstream method of extracting leaf average spectra ignores the leaf spatial information in hyperspectral imaging and dilutes the response characteristics exhibited by nutrient-sensitive local areas of leaves, thereby limiting the accuracy of modeling. This study proposes a spatial–spectral feature fusion method based on leaf-scale sub-region segmentation. It introduces a clustering algorithm to divide leaf pixel spectra into several subclasses, and segments sub-regions on the leaf surface based on clustering results. By optimizing the modeling contribution weights of leaf sub-regions, it improves the modeling and generalization accuracy of potassium diagnosis for rubber leaves. Experiments were carried out to verify the proposed method, and the results show that spatial–spectral feature fusion outperforms average spectral modeling. Specifically, after pixel-level MSC preprocessing, when the spectra of rubber leaf pixel regions were clustered into nine subsets, the diagnostic accuracy of potassium content in rubber leaves reached 0.97, which is better than the 0.87 achieved by average spectral modeling. Additionally, precision, macro-F1, and macro-recall all reached 0.97, which is superior to the results of average spectral modeling. Moreover, the proposed method is also superior to the spatial–spectral feature fusion method that integrates texture features. The visualization results of leaf sub-region weights showed that strengthening the modeling contribution of leaf edge regions is conducive to improving the diagnostic accuracy of potassium in rubber leaves, which is consistent with the response pattern of leaves to potassium.
(This article belongs to the Special Issue Artificial Intelligence in Hyperspectral Remote Sensing Data Analysis)
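The clustering step that splits leaf pixels into spectral subclasses can be illustrated with plain k-means over per-pixel spectra (the abstract does not name the clustering algorithm, so k-means here, and all names, are assumptions):

```python
import random

def kmeans(spectra, k, iters=20, seed=0):
    """Minimal k-means over pixel spectra (lists of reflectance values),
    of the kind used to split a leaf into spectrally similar sub-regions."""
    rng = random.Random(seed)
    centers = rng.sample(spectra, k)  # initial centers: k random pixels

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    for _ in range(iters):
        # Assign each pixel spectrum to its nearest center
        labels = [min(range(k), key=lambda j: dist(s, centers[j]))
                  for s in spectra]
        # Recompute each center as the mean spectrum of its members
        for j in range(k):
            members = [s for s, l in zip(spectra, labels) if l == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

The resulting labels define the leaf sub-regions whose modeling weights the paper then optimizes.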

12 pages, 2172 KB  
Article
Instance Segmentation Method for Insulators in Complex Backgrounds Based on Improved SOLOv2
by Ze Chen, Yangpeng Ji, Xiaodong Du, Shaokang Zhao, Zhenfei Huo and Xia Fang
Sensors 2025, 25(17), 5318; https://doi.org/10.3390/s25175318 - 27 Aug 2025
Abstract
To precisely delineate the contours of insulators in complex transmission line images obtained from Unmanned Aerial Vehicle (UAV) inspections and thereby facilitate subsequent defect analysis, this study proposes an instance segmentation framework predicated upon an enhanced SOLOv2 model. The proposed framework integrates a preprocessed edge channel, generated through the Non-Subsampled Contourlet Transform (NSCT), which augments the model’s capability to accurately capture the edges of insulators. Moreover, the input image resolution to the network is heightened to 1200 × 1600, permitting more detailed extraction of edges. Rather than the original ResNet + FPN architecture, the improved HRNet is utilized as the backbone to effectively harness multi-scale feature information, thereby enhancing the model’s overall efficacy. In response to the increased input size, there is a reduction in the network’s channel count, concurrent with an increase in the number of layers, ensuring an adequate receptive field without substantially escalating network parameters. Additionally, a Convolutional Block Attention Module (CBAM) is incorporated to refine mask quality and augment object detection precision. Furthermore, to bolster the model’s robustness and minimize annotation demands, a virtual dataset is crafted utilizing the fourth-generation Unreal Engine (UE4). Empirical results reveal that the proposed framework exhibits superior performance, with AP0.50 (90.21%), AP0.75 (83.34%), and AP[0.50:0.95] (67.26%) on a test set consisting of images supplied by the power grid. This framework surpasses existing methodologies and contributes significantly to the advancement of intelligent transmission line inspection. Full article
(This article belongs to the Special Issue Recent Trends and Advances in Intelligent Fault Diagnostics)

18 pages, 10978 KB  
Article
A Lightweight Infrared and Visible Light Multimodal Fusion Method for Object Detection in Power Inspection
by Linghao Zhang, Junwei Kuang, Yufei Teng, Siyu Xiang, Lin Li and Yingjie Zhou
Processes 2025, 13(9), 2720; https://doi.org/10.3390/pr13092720 - 26 Aug 2025
Abstract
Visible and infrared thermal imaging are crucial techniques for detecting structural and temperature anomalies in electrical power system equipment. To meet the demand for multimodal infrared/visible light monitoring of target devices, this paper introduces CBAM-YOLOv4, an improved lightweight object detection model, which leverages a novel synergistic integration of the Convolutional Block Attention Module (CBAM) with YOLOv4. The model employs MobileNet-v3 as the backbone to reduce parameter count, applies depthwise separable convolution to decrease computational complexity, and incorporates the CBAM module to enhance the extraction of critical optical features under complex backgrounds. Furthermore, a feature-level fusion strategy is adopted to integrate visible and infrared image information effectively. Validation on public datasets demonstrates that the proposed model achieves an 18.05 frames per second increase in detection speed over the baseline, a 1.61% improvement in mean average precision (mAP), and a 2 MB reduction in model size, substantially improving both detection accuracy and efficiency through this optimized integration in anomaly inspection of electrical equipment. Validation on a representative edge device, the NVIDIA Jetson Nano, confirms the model’s practical applicability. After INT8 quantization, the model achieves a real-time inference speed of 40.8 FPS with a high mAP of 80.91%, while consuming only 5.2 W of power. Compared to the standard YOLOv4, our model demonstrates a significant improvement in both processing efficiency and detection accuracy, offering a uniquely balanced and deployable solution for mobile inspection platforms. Full article
(This article belongs to the Special Issue Hybrid Artificial Intelligence for Smart Process Control)
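CBAM combines channel and spatial attention; its channel branch rescales each feature channel by a gate computed from pooled statistics of that channel. The sketch below is a heavily simplified channel branch that omits the shared MLP of the real module (structure and names are illustrative, not the paper's implementation):

```python
import math

def channel_attention(feature_maps):
    """Simplified CBAM-style channel attention: each 2-D channel is rescaled
    by sigmoid(avg_pool + max_pool) of its own activations. The real module
    passes both pooled values through a shared MLP before summing them."""
    weights = []
    for ch in feature_maps:
        flat = [v for row in ch for v in row]
        gate = sum(flat) / len(flat) + max(flat)       # avg-pool + max-pool
        weights.append(1.0 / (1.0 + math.exp(-gate)))  # sigmoid gate
    return [[[v * w for v in row] for row in ch]
            for ch, w in zip(feature_maps, weights)]
```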

33 pages, 6933 KB  
Review
Enhancing Knowledge of Construction Safety: A Semantic Network Analysis Approach
by Yuntao Cao, Shujie Wu, Yuting Chen, Martin Skitmore, Xingguan Ma and Jun Wang
Buildings 2025, 15(17), 3036; https://doi.org/10.3390/buildings15173036 - 26 Aug 2025
Abstract
The construction industry is recognized as high-risk due to frequent accidents and injuries, prompting extensive research and bibliometric analysis of construction safety. However, little attention has been given to the evolution and interconnections of key research topics in this field. This study applies semantic network analysis (SNA) to examine relationships and trends in construction safety research over the past 30 years. SNA enables quantitative exploration of topic interrelationships that is difficult to achieve with other approaches. Chronological network graphs are evaluated using the number of nodes, edges, density, average clustering coefficient, and average path length. Prominent topics are identified through degree, betweenness, and eigenvector centrality measures. The analysis combines a global overview of the main network, a chronological perspective, and local examination of clusters based on five macro keywords: accident, safety management, worker behavior, machine learning, and safety training. Results show a shift from traditional concerns with mortality and injuries to contemporary issues, such as safety climate, worker behavior, and technological innovations, including building information modeling, machine learning, and real-time monitoring. Topics with lower centrality scores are identified as under-researched. Overall, SNA offers a comprehensive view of the construction safety knowledge system, guiding researchers toward emerging topics and helping practitioners prioritize resources and design integrated safety risk strategies. Full article
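Of the centrality measures used, degree centrality is the simplest: a topic's score is the fraction of other topics it is directly linked to. A minimal sketch over an undirected edge list (pure Python; in practice a library such as NetworkX would be used):

```python
def degree_centrality(edges):
    """Normalized degree centrality for an undirected semantic network:
    degree / (n - 1), i.e. the fraction of other topics a topic links to."""
    nodes = sorted({v for e in edges for v in e})
    deg = {v: 0 for v in nodes}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    n = len(nodes)
    return {v: deg[v] / (n - 1) for v in nodes}
```

Betweenness and eigenvector centrality, also used in the study, weight nodes by shortest-path brokerage and by the centrality of their neighbors, respectively.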

22 pages, 5535 KB  
Article
OFNet: Integrating Deep Optical Flow and Bi-Domain Attention for Enhanced Change Detection
by Liwen Zhang, Quan Zou, Guoqing Li, Wenyang Yu, Yong Yang and Heng Zhang
Remote Sens. 2025, 17(17), 2949; https://doi.org/10.3390/rs17172949 - 25 Aug 2025
Abstract
Change detection technology holds significant importance in disciplines such as urban planning, land utilization tracking, and hazard evaluation, as it can efficiently and accurately reveal dynamic regional change processes, providing crucial support for scientific decision-making and refined management. Although deep learning methods based on computer vision have achieved remarkable progress in change detection, they still face challenges including reducing dynamic background interference, capturing subtle changes, and effectively fusing multi-temporal data features. To address these issues, this paper proposes a novel change detection model called OFNet. Building upon existing Siamese network architectures, we introduce an optical flow branch module that supplements pixel-level dynamic information. By incorporating motion features to guide the network’s attention to potential change regions, we enhance the model’s ability to characterize and discriminate genuine changes in cross-temporal remote sensing images. Additionally, we innovatively propose a dual-domain attention mechanism that simultaneously models discriminative features in both spatial and frequency domains for change detection tasks. The spatial attention focuses on capturing edge and structural changes, while the frequency-domain attention strengthens responses to key frequency components. The synergistic fusion of these two attention mechanisms effectively improves the model’s sensitivity to detailed changes and enhances the overall robustness of detection. Experimental results demonstrate that OFNet achieves an IoU of 83.03 on the LEVIR-CD dataset and 82.86 on the WHU-CD dataset, outperforming current mainstream approaches and validating its superior detection performance and generalization capability. This presents a novel technical method for environmental observation and urban transformation analysis tasks. Full article
(This article belongs to the Special Issue Advances in Remote Sensing Image Target Detection and Recognition)

22 pages, 3691 KB  
Article
Graph Convolutional Network with Agent Attention for Recognizing Digital Ink Chinese Characters Written by International Students
by Huafen Xu and Xiwen Zhang
Information 2025, 16(9), 729; https://doi.org/10.3390/info16090729 - 25 Aug 2025
Abstract
Digital ink Chinese characters (DICCs) written by international students often contain various errors and irregularities, making the recognition of these characters a highly challenging pattern recognition problem. This paper designs a graph convolutional network with agent attention (GCNAA) for recognizing DICCs written by international students. Each sampling point is treated as a vertex in a graph, with connections between adjacent sampling points within the same stroke serving as edges, creating a Chinese character graph structure. The GCNAA processes this graph-structured data and is implemented by stacking Block modules. In each Block module, the graph agent attention module not only models the global context between graph nodes but also reduces computational complexity, shortens training time, and accelerates inference. The graph convolution block module models the local adjacency structure of the graph by aggregating geometric information from neighboring nodes, while graph pooling is employed to learn multi-resolution features. Finally, the Softmax function is used to generate prediction results. Experiments conducted on public datasets such as CASIA-OLHWDB1.0-1.2, SCUT-COUCH2009 GB1&GB2, and HIT-OR3C-ONLINE demonstrate that the GCNAA performs well even on large-category datasets, showing strong generalization ability and robustness. The recognition accuracy for DICCs written by international students reaches 98.7%. Accurate and efficient handwritten Chinese character recognition technology can provide a solid technical foundation for computer-assisted Chinese character writing for international students, thereby promoting the development of international Chinese character education.
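The graph construction described above (sampling points as vertices, edges between adjacent points within a stroke) and one local aggregation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: the function names are hypothetical, raw 2-D coordinates stand in for node features, and plain neighbor averaging stands in for the learned graph convolution.

```python
import numpy as np

def strokes_to_graph(strokes):
    # strokes: list of (n_i, 2) arrays of pen sampling points
    points = np.vstack(strokes)                  # all vertices, stacked
    edges, offset = [], 0
    for s in strokes:
        for i in range(len(s) - 1):
            # connect consecutive points within the same stroke only
            edges.append((offset + i, offset + i + 1))
        offset += len(s)
    return points, edges

def graph_conv(points, edges):
    # One aggregation step: average each vertex with its stroke neighbors.
    n = len(points)
    agg, deg = points.copy(), np.ones(n)
    for a, b in edges:                           # undirected edges
        agg[a] += points[b]; agg[b] += points[a]
        deg[a] += 1; deg[b] += 1
    return agg / deg[:, None]

# A toy two-stroke character: 3 points in stroke 1, 2 in stroke 2
strokes = [np.array([[0., 0.], [1., 0.], [2., 0.]]),
           np.array([[1., 1.], [1., 2.]])]
pts, es = strokes_to_graph(strokes)
feats = graph_conv(pts, es)
```

Note that the two strokes stay disconnected in the graph: no edge links vertex 2 (end of stroke 1) to vertex 3 (start of stroke 2), which is what lets the model distinguish stroke-internal geometry from the character's overall layout.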
(This article belongs to the Section Artificial Intelligence)

21 pages, 6790 KB  
Article
MGFormer: Super-Resolution Reconstruction of Retinal OCT Images Based on a Multi-Granularity Transformer
by Jingmin Luan, Zhe Jiao, Yutian Li, Yanru Si, Jian Liu, Yao Yu, Dongni Yang, Jia Sun, Zehao Wei and Zhenhe Ma
Photonics 2025, 12(9), 850; https://doi.org/10.3390/photonics12090850 - 25 Aug 2025
Abstract
Optical coherence tomography (OCT) acquisitions often reduce lateral sampling density to shorten scan time and suppress motion artifacts, but this strategy degrades the signal-to-noise ratio and obscures fine retinal microstructures. To recover these details without hardware modifications, we propose MGFormer, a lightweight Transformer for OCT super-resolution (SR) that integrates a multi-granularity attention mechanism with tensor distillation. A feature-enhancing convolution first sharpens edges; stacked multi-granularity attention blocks then fuse coarse-to-fine context, while a row-wise top-k operator retains the most informative tokens and preserves their positional order. We trained and evaluated MGFormer on B-scans from the Duke SD-OCT dataset at 2×, 4×, and 8× scaling factors. Relative to seven recent CNN- and Transformer-based SR models, MGFormer achieves the highest quantitative fidelity; at 4× it reaches 34.39 dB PSNR and 0.8399 SSIM, surpassing SwinIR by +0.52 dB and +0.026 SSIM, and reduces LPIPS by 21.4%. Compared with the same backbone without tensor distillation, FLOPs drop from 289G to 233G (−19.4%), and per-B-scan latency at 4× falls from 166.43 ms to 98.17 ms (−41.01%); the model remains compact (105.68 MB). A blinded reader study shows higher scores for boundary sharpness (4.2 ± 0.3), pathology discernibility (4.1 ± 0.3), and diagnostic confidence (4.3 ± 0.2), exceeding SwinIR by 0.3–0.5 points. These results suggest that MGFormer can provide fast, high-fidelity OCT SR suitable for routine clinical workflows.
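The row-wise top-k token retention mentioned above (keep the k highest-scoring tokens per row while preserving their positional order) can be sketched as follows. This is a minimal numpy illustration under stated assumptions, not MGFormer's implementation: the function name is hypothetical and the scores would in practice come from learned attention, not a hand-written array.

```python
import numpy as np

def topk_tokens(tokens, scores, k):
    # tokens: (rows, n_tokens, dim); scores: (rows, n_tokens)
    idx = np.argsort(scores, axis=-1)[:, -k:]    # top-k indices per row
    idx = np.sort(idx, axis=-1)                  # restore positional order
    kept = np.take_along_axis(tokens, idx[..., None], axis=1)
    return kept, idx

# Toy example: 2 rows of 6 one-dimensional tokens with made-up scores
tokens = np.arange(12, dtype=float).reshape(2, 6, 1)
scores = np.array([[0.1, 0.9, 0.3, 0.8, 0.2, 0.7],
                   [0.5, 0.1, 0.9, 0.2, 0.6, 0.3]])
kept, idx = topk_tokens(tokens, scores, k=3)
```

The second `np.sort` is the step that preserves positional order: without it the surviving tokens would be emitted in score order, scrambling the spatial layout that downstream attention layers rely on.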
(This article belongs to the Section Biophotonics and Biomedical Optics)

36 pages, 590 KB  
Review
Machine Translation in the Era of Large Language Models: A Survey of Historical and Emerging Problems
by Duygu Ataman, Alexandra Birch, Nizar Habash, Marcello Federico, Philipp Koehn and Kyunghyun Cho
Information 2025, 16(9), 723; https://doi.org/10.3390/info16090723 - 25 Aug 2025
Abstract
Historically regarded as one of the most challenging tasks on the path to complete artificial intelligence (AI), machine translation (MT) research has seen continuous effort over the past decade, resulting in cutting-edge architectures for modeling sequential information. While most statistical models traditionally relied on learning from parallel translation examples, recent research exploring self-supervised and multi-task learning methods has extended the capabilities of MT models, eventually enabling the creation of general-purpose large language models (LLMs). In addition to versatility in providing translations across languages and domains, LLMs can in principle perform any natural language processing (NLP) task given a sufficient number of task-specific examples. While LLMs have now reached a point where they can both replace and augment traditional MT models, the extent of their advantages and the ways in which they leverage translation capabilities across multilingual NLP tasks remain a wide area for exploration. In this literature survey, we present the current position of MT research with a historical look at different modeling approaches to MT, how these might be advantageous for particular problems, and which problems are solved or remain open in light of recent developments. We also discuss how MT models led to the development of prominent LLM architectures, how they continue to support LLM performance across different tasks by providing a means for cross-lingual knowledge transfer, and the redefinition of the task given the possibilities that LLM technology brings.
(This article belongs to the Special Issue Human and Machine Translation: Recent Trends and Foundations)
