Search Results (591)

Search Parameters:
Keywords = Siamese

17 pages, 4099 KB  
Article
A Transformer-Based Multi-Scale Semantic Extraction Change Detection Network for Building Change Application
by Lujin Hu, Senchuan Di, Zhenkai Wang and Yu Liu
Buildings 2025, 15(19), 3549; https://doi.org/10.3390/buildings15193549 - 2 Oct 2025
Abstract
Building change detection involves identifying areas where buildings have changed by comparing multi-temporal remote sensing imagery of the same geographical region. Recent advances in Transformer-based methods have significantly improved remote sensing change detection. However, current Transformer models still exhibit persistent limitations in effectively extracting multi-scale semantic features within complex scenarios. To more effectively extract multi-scale semantic features in complex scenes, we propose a novel model, which is the Transformer-based Multi-Scale Semantic Extraction Change Detection Network (MSSE-CDNet). The model employs a Siamese network architecture to enable precise change recognition. MSSE-CDNet comprises four parts, which together contain five modules: (1) a CNN feature extraction module, (2) a multi-scale semantic extraction module, (3) a Transformer encoder and decoder module, and (4) a prediction module. Comprehensive experiments on the standard LEVIR-CD benchmark for building change detection demonstrate our approach’s superiority over state-of-the-art methods. Compared to existing models such as FC-Siam-Di, FC-Siam-Conc, DTCTSCN, BIT, and SNUNet, MSSE-CDNet achieves significant and consistent gains in performance metrics, with F1 scores improved by 4.22%, 6.84%, 2.86%, 1.22%, and 2.37%, respectively, and Intersection over Union (IoU) improved by 6.78%, 10.74%, 4.65%, 2.02%, and 3.87%, respectively. These results robustly substantiate the effectiveness of our framework on an established benchmark dataset. Full article
(This article belongs to the Special Issue Big Data and Machine/Deep Learning in Construction)
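
No code accompanies this listing, but the Siamese change-detection pattern the abstract describes (one weight-shared encoder applied to both acquisition dates, features compared, a per-pixel change map predicted) can be illustrated with a minimal PyTorch sketch. Class names and layer sizes below are hypothetical, and none of MSSE-CDNet's multi-scale or Transformer modules are reproduced.

```python
import torch
import torch.nn as nn

class SiameseChangeDetector(nn.Module):
    """Generic Siamese change-detection skeleton (not the paper's MSSE-CDNet)."""
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        # One weight-shared CNN encoder is applied to both acquisition dates.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(feat_ch, 2, 1)  # 2 classes: change / no change

    def forward(self, img_t1, img_t2):
        f1, f2 = self.encoder(img_t1), self.encoder(img_t2)
        return self.head(torch.abs(f1 - f2))  # per-pixel change logits

x1, x2 = torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256)
print(SiameseChangeDetector()(x1, x2).shape)  # torch.Size([1, 2, 256, 256])
```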

29 pages, 23948 KB  
Article
CAGMC-Defence: A Cross-Attention-Guided Multimodal Collaborative Defence Method for Multimodal Remote Sensing Image Target Recognition
by Jiahao Cui, Hang Cao, Lingquan Meng, Wang Guo, Keyi Zhang, Qi Wang, Cheng Chang and Haifeng Li
Remote Sens. 2025, 17(19), 3300; https://doi.org/10.3390/rs17193300 - 25 Sep 2025
Abstract
With the increasing diversity of remote sensing modalities, multimodal image fusion improves target recognition accuracy but also introduces new security risks. Adversaries can inject small, imperceptible perturbations into a single modality to mislead model predictions, which undermines system reliability. Most existing defences are designed for single-modal inputs and face two key challenges in multimodal settings: (1) vulnerability to perturbation propagation due to static fusion strategies, and (2) the lack of collaborative mechanisms, which leaves overall robustness limited by the weakest modality. To address these issues, we propose CAGMC-Defence, a cross-attention-guided multimodal collaborative defence framework for multimodal remote sensing. It contains two main modules. The Multimodal Feature Enhancement and Fusion (MFEF) module adopts a pseudo-Siamese network and cross-attention to decouple features, capture intermodal dependencies, and suppress perturbation propagation through weighted regulation and consistency alignment. The Multimodal Adversarial Training (MAT) module jointly generates optical and SAR adversarial examples and optimizes network parameters under consistency loss, enhancing robustness and generalization. Experiments on the WHU-OPT-SAR dataset show that CAGMC-Defence maintains stable performance under various typical adversarial attacks, such as FGSM, PGD, and MIM, retaining 85.74% overall accuracy even under the strongest white-box MIM attack (ϵ=0.05), significantly outperforming existing multimodal defence baselines. Full article
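
To make the cross-attention idea concrete, the sketch below shows one plausible way two modality token streams (optical and SAR) can attend to each other before fusion. It is a generic illustration, not the MFEF module, and all dimensions are assumed.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Toy cross-attention fusion of two modality token streams (illustrative only)."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn_opt = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_sar = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, opt_tokens, sar_tokens):
        # Each modality queries the other, so a perturbation confined to one input
        # is re-weighted against the complementary (clean) modality before fusion.
        opt_fused, _ = self.attn_opt(opt_tokens, sar_tokens, sar_tokens)
        sar_fused, _ = self.attn_sar(sar_tokens, opt_tokens, opt_tokens)
        return torch.cat([opt_fused, sar_fused], dim=-1)

opt = torch.randn(2, 196, 64)  # optical patch tokens (batch of 2, 14x14 patches)
sar = torch.randn(2, 196, 64)  # corresponding SAR patch tokens
print(CrossModalFusion()(opt, sar).shape)  # torch.Size([2, 196, 128])
```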

18 pages, 960 KB  
Article
Fus: Combining Semantic and Structural Graph Information for Binary Code Similarity Detection
by Yanlin Li, Taiyan Wang, Lu Yu and Zulie Pan
Electronics 2025, 14(19), 3781; https://doi.org/10.3390/electronics14193781 - 24 Sep 2025
Viewed by 49
Abstract
Binary code similarity detection (BCSD) plays an important role in software security. Recent deep learning-based methods have made great progress. Existing methods based on a single feature, such as semantics or graph structure, struggle to handle changes caused by the architecture or compilation environment. Methods fusing semantics and graph structure suffer from insufficient learning of the function, resulting in low accuracy and robustness. To address this issue, we proposed Fus, a method that integrates semantic information from the pseudo-C code and structural features from the Abstract Syntax Tree (AST). The pseudo-C code and AST are robust against compilation and architectural changes and can represent the function well. Our approach consists of three steps. First, we preprocess the assembly code to obtain the pseudo-C code and AST for each function. Second, we employ a Siamese network with CodeBERT models to extract semantic embeddings from the pseudo-C code and Tree-Structured Long Short-Term Memory (Tree LSTM) to encode the AST. Finally, function similarity is computed by summing the respective semantic and structural similarities. The evaluation results show that our method outperforms the state-of-the-art methods in most scenarios. Especially in large-scale scenarios, its performance is remarkable. In the vulnerability search task, Fus achieves the highest recall. It demonstrates the accuracy and robustness of our method. Full article
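
The final scoring rule stated in the abstract (function similarity = semantic similarity + structural similarity) is simple enough to sketch directly; the embeddings below are random placeholders standing in for CodeBERT and Tree-LSTM outputs.

```python
import torch
import torch.nn.functional as F

def combined_similarity(sem_a, sem_b, ast_a, ast_b):
    """Score two functions by summing a semantic similarity (pseudo-C embeddings)
    and a structural similarity (AST embeddings), as stated in the abstract."""
    return (F.cosine_similarity(sem_a, sem_b, dim=-1) +
            F.cosine_similarity(ast_a, ast_b, dim=-1))

# Random placeholders standing in for CodeBERT and Tree-LSTM encoder outputs.
sem_a, sem_b = torch.randn(768), torch.randn(768)
ast_a, ast_b = torch.randn(256), torch.randn(256)
print(float(combined_similarity(sem_a, sem_b, ast_a, ast_b)))
```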

25 pages, 12760 KB  
Article
Intelligent Face Recognition: Comprehensive Feature Extraction Methods for Holistic Face Analysis and Modalities
by Thoalfeqar G. Jarullah, Ahmad Saeed Mohammad, Musab T. S. Al-Kaltakchi and Jabir Alshehabi Al-Ani
Signals 2025, 6(3), 49; https://doi.org/10.3390/signals6030049 - 19 Sep 2025
Viewed by 428
Abstract
Face recognition technology utilizes unique facial features to analyze and compare individuals for identification and verification purposes. This technology is crucial for several reasons, such as improving security and authentication, effectively verifying identities, providing personalized user experiences, and automating various operations, including attendance monitoring, access management, and law enforcement activities. In this paper, comprehensive evaluations are conducted using different face detection and modality segmentation methods, feature extraction methods, and classifiers to improve system performance. As for face detection, four methods are proposed: OpenCV’s Haar Cascade classifier, Dlib’s HOG + SVM frontal face detector, Dlib’s CNN face detector, and Mediapipe’s face detector. Additionally, two types of feature extraction techniques are proposed: hand-crafted features (traditional methods: global and local features) and deep learning features. Three global features were extracted: Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), and Global Image Structure (GIST). Likewise, the following local feature methods are utilized: Local Binary Pattern (LBP), Weber local descriptor (WLD), and Histogram of Oriented Gradients (HOG). On the other hand, the deep learning-based features fall into two categories: convolutional neural networks (CNNs), including VGG16, VGG19, and VGG-Face, and Siamese neural networks (SNNs), which generate face embeddings. For classification, three methods are employed: Support Vector Machine (SVM), a one-class SVM variant, and Multilayer Perceptron (MLP). The system is evaluated on three datasets: in-house, Labelled Faces in the Wild (LFW), and the Pins dataset (sourced from Pinterest), providing comprehensive benchmark comparisons for facial recognition research. Among the ten proposed feature extraction methods applied to the in-house database, the best facial recognition accuracy of 99.8% was achieved by the VGG16 model combined with the SVM classifier. Full article
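
A hedged sketch of the last stage of such a pipeline, an SVM classifier trained on pre-extracted face feature vectors, is shown below; the synthetic features stand in for any of the extractors evaluated in the paper, and the dimensions are assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic stand-in: 200 face feature vectors for 10 identities; real vectors
# would come from one of the extractors above (e.g. VGG16 or Siamese embeddings).
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(10), 20)
features = rng.normal(size=(200, 128)) + labels[:, None] * 0.5

X_tr, X_te, y_tr, y_te = train_test_split(
    features, labels, test_size=0.25, stratify=labels, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)  # SVM classifier on the feature vectors
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```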

20 pages, 2172 KB  
Article
Securing Smart Grids: A Triplet Loss Function Siamese Network-Based Approach for Detecting Electricity Theft in Power Utilities
by Touqeer Ahmed, Muhammad Salman Saeed, Muhammad I. Masud, Zeeshan Ahmad Arfeen, Mazhar Baloch, Mohammed Aman and Mohsin Shahzad
Energies 2025, 18(18), 4957; https://doi.org/10.3390/en18184957 - 18 Sep 2025
Viewed by 211
Abstract
Electricity theft in power grids results in significant economic losses for utility companies. While machine learning (ML) methods have shown promising results in detecting such frauds, they often suffer from low detection rates, leading to excessive physical inspections. In this study, we attempted to solve the above-mentioned problem using a novel approach. The proposed framework utilizes the intelligence of Siamese network architecture with the Triplet Loss function to detect electricity theft using a labeled dataset obtained from Multan Electric Power Company (MEPCO), Pakistan. The proposed method involves analyzing and comparing the consumption patterns of honest and fraudulent consumers, enabling the model to distinguish between the two categories with enhanced accuracy and detection rates. We incorporate advanced feature extraction techniques and data mining methods to transform raw consumption data into informative features, such as time-based consumption profiles and anomalous load behaviors, which are crucial for detecting abnormal patterns in electricity consumption. The refined dataset is then used to train the Siamese network, where the Triplet Loss function optimizes the model by maximizing the distance between dissimilar (fraudulent and honest) consumption patterns while minimizing the distance among similar ones. The results demonstrate that our proposed solution outperforms traditional methods by significantly improving accuracy (95.4%) and precision (92%). Eventually, the integration of feature extraction with Siamese networks and Triplet Loss offers a scalable and robust framework for enhancing the security and operational efficiency of power grids. Full article
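
The triplet objective at the heart of the framework can be sketched in a few lines of PyTorch: one shared embedding network maps consumption profiles to vectors, and the loss pulls same-class profiles together while pushing honest and fraudulent profiles apart. The 96-dimensional input and the small MLP are assumptions for illustration only.

```python
import torch
import torch.nn as nn

# One shared embedding network maps load profiles to vectors (input size assumed).
embed = nn.Sequential(nn.Linear(96, 64), nn.ReLU(), nn.Linear(64, 32))
triplet = nn.TripletMarginLoss(margin=1.0)

anchor   = embed(torch.randn(8, 96))  # honest consumers
positive = embed(torch.randn(8, 96))  # other honest consumers
negative = embed(torch.randn(8, 96))  # fraudulent consumers

# Pull anchor and positive together, push anchor and negative apart.
loss = triplet(anchor, positive, negative)
loss.backward()
print(float(loss))
```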

20 pages, 55265 KB  
Article
Learning Precise Mask Representation for Siamese Visual Tracking
by Peng Yang, Fen Hu, Qinghui Wang and Lei Dou
Sensors 2025, 25(18), 5743; https://doi.org/10.3390/s25185743 - 15 Sep 2025
Viewed by 387
Abstract
Siamese network trackers are a prominent paradigm in visual object tracking due to efficient similarity learning. However, most Siamese trackers are restricted to the bounding box tracking format, which often fails to accurately describe the appearance of non-rigid targets with complex deformations. Additionally, since the bounding box frequently includes excessive background pixels, trackers are sensitive to similar distractors. To address these issues, we propose a novel segmentation-assisted model that learns binary mask representations of targets. This model is generic and can be seamlessly integrated into various Siamese frameworks, enabling pixel-wise segmentation tracking instead of the suboptimal bounding box tracking. Specifically, our model features two core components: (i) a multi-stage precise mask representation module composed of cascaded U-Net decoders, designed to predict segmentation masks of targets, and (ii) a saliency localization head based on the Euclidean model, which extracts spatial position constraints to boost the decoder’s discriminative capability. Extensive experiments on five tracking benchmarks demonstrate that our method effectively improves the performance of both anchor-based and anchor-free Siamese trackers. Notably, on GOT-10k, our method increases the AO scores of the baseline trackers SiamRPN++ (anchor-based) and SiamBAN (anchor-free) by 5.2% and 7.5%, respectively while maintaining speeds exceeding 60 FPS. Full article
(This article belongs to the Special Issue Deep Learning Technology and Image Sensing: 2nd Edition)
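
For context, the similarity operation underlying most Siamese trackers, depth-wise cross-correlation of template features against search-region features, is sketched below. It is background to the paper rather than its mask-prediction contribution, and the feature sizes are illustrative.

```python
import torch
import torch.nn.functional as F

def siamese_xcorr(template_feat, search_feat):
    """Depth-wise cross-correlation: slide the exemplar (template) feature map over
    the search-region feature map, one correlation kernel per channel."""
    b, c, h, w = template_feat.shape
    kernels = template_feat.reshape(b * c, 1, h, w)
    search = search_feat.reshape(1, b * c, *search_feat.shape[2:])
    response = F.conv2d(search, kernels, groups=b * c)
    return response.reshape(b, c, *response.shape[2:])

template = torch.randn(1, 256, 7, 7)    # target exemplar features
search   = torch.randn(1, 256, 31, 31)  # search-region features
print(siamese_xcorr(template, search).shape)  # torch.Size([1, 256, 25, 25])
```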

27 pages, 1902 KB  
Article
Few-Shot Breast Cancer Diagnosis Using a Siamese Neural Network Framework and Triplet-Based Loss
by Tea Marasović and Vladan Papić
Algorithms 2025, 18(9), 567; https://doi.org/10.3390/a18090567 - 8 Sep 2025
Viewed by 423
Abstract
Breast cancer is one of the leading causes of death among women of all ages and backgrounds globally. In recent years, the growing deficit of expert radiologists—particularly in underdeveloped countries—alongside a surge in the number of images for analysis, has negatively affected the ability to secure timely and precise diagnostic results in breast cancer screening. AI technologies offer powerful tools that allow for the effective diagnosis and survival forecasting, reducing the dependency on human cognitive input. Towards this aim, this research introduces a deep meta-learning framework for swift analysis of mammography images—combining a Siamese network model with a triplet-based loss function—to facilitate automatic screening (recognition) of potentially suspicious breast cancer cases. Three pre-trained deep CNN architectures, namely GoogLeNet, ResNet50, and MobileNetV3, are fine-tuned and scrutinized for their effectiveness in transforming input mammograms to a suitable embedding space. The proposed framework undergoes a comprehensive evaluation through a rigorous series of experiments, utilizing two different, publicly accessible, and widely used datasets of digital X-ray mammograms: INbreast and CBIS-DDSM. The experimental results demonstrate the framework’s strong performance in differentiating between tumorous and normal images, even with a very limited number of training samples, on both datasets. Full article
(This article belongs to the Special Issue Machine Learning for Pattern Recognition (3rd Edition))
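
A minimal sketch of the few-shot decision rule implied by such metric-learning setups, assigning a query to the class of its nearest support embedding, follows; the toy embeddings and the nearest-neighbour rule are assumptions, not the paper's exact protocol.

```python
import numpy as np

def nearest_class(query_emb, support_embs, support_labels):
    """Assign the query mammogram to the class of its nearest support embedding
    (Euclidean distance in the learned embedding space)."""
    dists = np.linalg.norm(support_embs - query_emb, axis=1)
    return support_labels[int(np.argmin(dists))]

# Toy 2-D embeddings standing in for fine-tuned CNN outputs (e.g. ResNet50).
support = np.array([[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2]])
labels = np.array(["normal", "normal", "tumour", "tumour"])
print(nearest_class(np.array([0.85, 0.15]), support, labels))  # tumour
```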

18 pages, 3709 KB  
Article
AI-Based Response Classification After Anti-VEGF Loading in Neovascular Age-Related Macular Degeneration
by Murat Fırat, İlknur Tuncer Fırat, Ziynet Fadıllıoğlu Üstündağ, Emrah Öztürk and Taner Tuncer
Diagnostics 2025, 15(17), 2253; https://doi.org/10.3390/diagnostics15172253 - 5 Sep 2025
Viewed by 583
Abstract
Background/Objectives: Wet age-related macular degeneration (AMD) is a progressive retinal disease characterized by macular neovascularization (MNV). Currently, the standard treatment for wet AMD is intravitreal anti-VEGF administration, which aims to control disease activity by suppressing neovascularization. In clinical practice, the decision to continue or discontinue treatment is largely based on the presence of fluid on optical coherence tomography (OCT) and changes in visual acuity. However, discrepancies between anatomic and functional responses can occur during these assessments. Methods: This article presents an artificial intelligence (AI)-based classification model developed to objectively assess the response to anti-VEGF treatment in patients with AMD at 3 months. This retrospective study included 120 patients (144 eyes) who received intravitreal bevacizumab treatment. After bevacizumab loading treatment, the presence of subretinal/intraretinal fluid (SRF/IRF) on OCT images and changes in visual acuity (logMAR) were evaluated. Patients were divided into three groups: Class 0, active disease (persistent SRF/IRF); Class 1, good response (no SRF/IRF and ≥0.1 logMAR improvement); and Class 2, limited response (no SRF/IRF but with <0.1 logMAR improvement). Pre-treatment and 3-month post-treatment OCT image pairs were used for training and testing the artificial intelligence model. Based on this grouping, classification was performed with a Siamese neural network (ResNet-18-based) model. Results: The model achieved 95.4% accuracy. The macro precision, macro recall, and macro F1 scores for the classes were 0.948, 0.949, and 0.948, respectively. Layer Class Activation Map (LayerCAM) heat maps and Shapley Additive Explanations (SHAP) overlays confirmed that the model focused on pathology-related regions. Conclusions: In conclusion, the model classifies post-loading response by predicting both anatomic disease activity and visual prognosis from OCT images. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
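
The pairing described here (pre- and post-treatment OCT images passed through a shared ResNet-18 and classified into three response classes) can be sketched as below; the fusion head, input preprocessing, and the use of 3-channel 224×224 inputs are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class PairedOCTClassifier(nn.Module):
    """Shared ResNet-18 encodes the pre- and post-treatment OCT images; a small
    head maps the concatenated embeddings to the three response classes."""
    def __init__(self, n_classes=3):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()        # keep the 512-d pooled features
        self.backbone = backbone
        self.head = nn.Linear(512 * 2, n_classes)

    def forward(self, oct_pre, oct_post):
        z = torch.cat([self.backbone(oct_pre), self.backbone(oct_post)], dim=1)
        return self.head(z)

# B-scans assumed resized to 224x224 and replicated to 3 channels.
pre, post = torch.randn(2, 3, 224, 224), torch.randn(2, 3, 224, 224)
print(PairedOCTClassifier()(pre, post).shape)  # torch.Size([2, 3])
```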

22 pages, 5535 KB  
Article
OFNet: Integrating Deep Optical Flow and Bi-Domain Attention for Enhanced Change Detection
by Liwen Zhang, Quan Zou, Guoqing Li, Wenyang Yu, Yong Yang and Heng Zhang
Remote Sens. 2025, 17(17), 2949; https://doi.org/10.3390/rs17172949 - 25 Aug 2025
Viewed by 603
Abstract
Change detection technology holds significant importance in disciplines such as urban planning, land utilization tracking, and hazard evaluation, as it can efficiently and accurately reveal dynamic regional change processes, providing crucial support for scientific decision-making and refined management. Although deep learning methods based on computer vision have achieved remarkable progress in change detection, they still face challenges including reducing dynamic background interference, capturing subtle changes, and effectively fusing multi-temporal data features. To address these issues, this paper proposes a novel change detection model called OFNet. Building upon existing Siamese network architectures, we introduce an optical flow branch module that supplements pixel-level dynamic information. By incorporating motion features to guide the network’s attention to potential change regions, we enhance the model’s ability to characterize and discriminate genuine changes in cross-temporal remote sensing images. Additionally, we innovatively propose a dual-domain attention mechanism that simultaneously models discriminative features in both spatial and frequency domains for change detection tasks. The spatial attention focuses on capturing edge and structural changes, while the frequency-domain attention strengthens responses to key frequency components. The synergistic fusion of these two attention mechanisms effectively improves the model’s sensitivity to detailed changes and enhances the overall robustness of detection. Experimental results demonstrate that OFNet achieves an IoU of 83.03 on the LEVIR-CD dataset and 82.86 on the WHU-CD dataset, outperforming current mainstream approaches and validating its superior detection performance and generalization capability. This presents a novel technical method for environmental observation and urban transformation analysis tasks. Full article
(This article belongs to the Special Issue Advances in Remote Sensing Image Target Detection and Recognition)
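
As a loose illustration of attending to features in the frequency domain, the sketch below re-weights a feature map's 2-D Fourier spectrum with a learned gate before transforming back; OFNet's actual bi-domain attention is not reproduced, and the gating scheme is an assumption.

```python
import torch
import torch.nn as nn

class FrequencyGate(nn.Module):
    """Re-weight a feature map's 2-D Fourier spectrum with a learned per-channel
    gate, then transform back (a loose stand-in for frequency-domain attention)."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Parameter(torch.ones(1, channels, 1, 1))

    def forward(self, x):
        spec = torch.fft.rfft2(x, norm="ortho")   # complex spectrum
        spec = spec * self.gate                   # spectral re-weighting
        return torch.fft.irfft2(spec, s=x.shape[-2:], norm="ortho")

feat = torch.randn(1, 64, 32, 32)
print(FrequencyGate(64)(feat).shape)  # torch.Size([1, 64, 32, 32])
```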

25 pages, 3109 KB  
Article
Radio Frequency Fingerprinting Authentication for IoT Networks Using Siamese Networks
by Raju Dhakal, Laxima Niure Kandel and Prashant Shekhar
IoT 2025, 6(3), 47; https://doi.org/10.3390/iot6030047 - 22 Aug 2025
Viewed by 914
Abstract
As IoT (internet of things) devices grow in prominence, safeguarding them from cyberattacks is becoming a pressing challenge. To bootstrap IoT security, device identification or authentication is crucial for establishing trusted connections among devices without prior trust. In this regard, radio frequency fingerprinting (RFF) is gaining attention because it is more efficient and requires fewer computational resources compared to resource-intensive cryptographic methods, such as digital signatures. RFF works by identifying unique manufacturing defects in the radio circuitry of IoT devices by analyzing over-the-air signals that embed these imperfections, allowing for the identification of the transmitting hardware. Recent studies on RFF often leverage advanced classification models, including classical machine learning techniques such as K-Nearest Neighbor (KNN) and Support Vector Machine (SVM), as well as modern deep learning architectures like Convolutional Neural Network (CNN). In particular, CNNs are well-suited as they use multidimensional mapping to detect and extract reliable fingerprints during the learning process. However, a significant limitation of these approaches is that they require large datasets and necessitate retraining when new devices not included in the initial training set are added. This retraining can cause service interruptions and is costly, especially in large-scale IoT networks. In this paper, we propose a novel solution to this problem: RFF using Siamese networks, which eliminates the need for retraining and allows for seamless authentication in IoT deployments. The proposed Siamese network is trained using in-phase and quadrature (I/Q) samples from 10 different Software-Defined Radios (SDRs). Additionally, we present a new algorithm, the Similarity-Based Embedding Classification (SBEC) for RFF. We present experimental results that demonstrate that the Siamese network effectively distinguishes between malicious and trusted devices with a remarkable 98% identification accuracy. Full article
(This article belongs to the Special Issue Cybersecurity in the Age of the Internet of Things)
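
A hedged sketch of similarity-based embedding classification follows: a device's embedding is matched against enrolled references and rejected as untrusted when no reference is similar enough. The paper's SBEC algorithm is not reproduced; the cosine metric and the threshold value are assumptions.

```python
import numpy as np

def classify_by_similarity(query_emb, reference_embs, device_ids, threshold=0.8):
    """Compare a device embedding to enrolled references (cosine similarity) and
    reject it as untrusted if no reference is similar enough."""
    q = query_emb / np.linalg.norm(query_emb)
    refs = reference_embs / np.linalg.norm(reference_embs, axis=1, keepdims=True)
    sims = refs @ q
    best = int(np.argmax(sims))
    return device_ids[best] if sims[best] >= threshold else "untrusted"

rng = np.random.default_rng(1)
references = rng.normal(size=(10, 64))  # one stored embedding per enrolled SDR
query = references[3] + 0.01 * rng.normal(size=64)
print(classify_by_similarity(query, references, [f"sdr_{i}" for i in range(10)]))
```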

23 pages, 3781 KB  
Article
Evaluating Urban Visual Attractiveness Perception Using Multimodal Large Language Model and Street View Images
by Qianyu Zhou, Jiaxin Zhang and Zehong Zhu
Buildings 2025, 15(16), 2970; https://doi.org/10.3390/buildings15162970 - 21 Aug 2025
Cited by 1 | Viewed by 831
Abstract
Visual attractiveness perception—an individual’s capacity to recognise and evaluate the visual appeal of urban scene safety—has direct implications for well-being, economic vitality, and social cohesion. However, most empirical studies rely on single-source metrics or algorithm-centric pipelines that under-represent human perception. Addressing this gap, we introduce a fully reproducible, multimodal framework that measures and models this domain-specific facet of human intelligence by coupling Generative Pre-trained Transformer 4o (GPT-4o) with 1000 Street View images. The pipeline first elicits pairwise aesthetic judgements from GPT-4o, converts them into a latent attractiveness scale via Thurstone’s law of comparative judgement, and then validates the scale against 1.17 M crowdsourced ratings from MIT’s Place Pulse 2.0 benchmark (Spearman ρ = 0.76, p < 0.001). Compared with a Siamese CNN baseline (ρ = 0.60), GPT-4o yields both higher criterion validity and an 88% reduction in inference time, underscoring its superior capacity to approximate human evaluative reasoning. In this study, we introduce a standardised and reproducible streetscape evaluation pipeline using GPT-4o. We then combine the resulting attractiveness scores with network-based accessibility modelling to generate an “aesthetic–accessibility map” of urban central districts in Chongqing, China. Cluster analysis reveals four statistically distinct street types—Iconic Core, Liveable Rings, Transit-Rich but Bland, and Peripheral Low-Appeal—providing actionable insights for landscape design, urban governance, and tourism planning. Full article
(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)
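
Thurstone's law of comparative judgement (Case V) can be sketched directly: pairwise preference counts become win proportions, which are mapped through the inverse normal CDF and averaged into latent scores. The toy win matrix below is illustrative, not data from the study.

```python
import numpy as np
from scipy.stats import norm

def thurstone_case_v(win_matrix):
    """win_matrix[i, j] = number of times item i was preferred over item j.
    Returns a latent scale value per item (Thurstone Case V)."""
    trials = win_matrix + win_matrix.T
    p = np.where(trials > 0, win_matrix / np.maximum(trials, 1), 0.5)
    p = np.clip(p, 0.01, 0.99)    # avoid infinite z-scores for unanimous pairs
    z = norm.ppf(p)               # preference proportion -> normal deviate
    np.fill_diagonal(z, 0.0)
    return z.mean(axis=1)         # average deviate over all opponents

wins = np.array([[0, 8, 9],
                 [2, 0, 7],
                 [1, 3, 0]])      # toy pairwise preferences for 3 street views
print(thurstone_case_v(wins))     # item 0 receives the highest scale value
```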

30 pages, 21184 KB  
Article
FSTC-DiMP: Advanced Feature Processing and Spatio-Temporal Consistency for Anti-UAV Tracking
by Desen Bu, Bing Ding, Xiaozhong Tong, Bei Sun, Xiaoyong Sun, Runze Guo and Shaojing Su
Remote Sens. 2025, 17(16), 2902; https://doi.org/10.3390/rs17162902 - 20 Aug 2025
Viewed by 748
Abstract
The widespread application of UAV technology has brought significant security concerns that cannot be ignored, driving considerable attention to anti-unmanned aerial vehicle (UAV) tracking technologies. Anti-UAV tracking faces challenges, including target entry into and exit from the field of view, thermal crossover, and interference from similar objects, where Siamese network trackers exhibit notable limitations in anti-UAV tracking. To address these issues, we propose FSTC-DiMP, an anti-UAV tracking algorithm. To better handle feature extraction in low-Signal-to-Clutter-Ratio (SCR) images and expand receptive fields, we introduce the Large Selective Kernel (LSK) attention mechanism, achieving a balance between local feature focus and global information integration. A spatio-temporal consistency-guided re-detection mechanism is designed to mitigate tracking failures caused by target entry into and exit from the field of view or similar-object interference through spatio-temporal relationship analysis. Additionally, a background augmentation module has been developed to more efficiently utilise initial frame information, effectively capturing the semantic features of both targets and their surrounding environments. Experimental results on the AntiUAV410 and AntiUAV600 datasets demonstrate that FSTC-DiMP achieves significant performance improvements in anti-UAV tracking tasks, validating the algorithm’s strong robustness and adaptability to complex environments. Full article
(This article belongs to the Special Issue Recent Advances in Infrared Target Detection)
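
As one way to picture spatio-temporal consistency-guided re-detection, the toy gate below accepts a re-detected candidate only if it lies within a plausible travel radius of the last confident position. This is an illustrative heuristic, not FSTC-DiMP's mechanism, and the speed bound is invented.

```python
import numpy as np

def consistent_redetection(candidates, last_pos, frames_missed, max_speed=30.0):
    """Accept a re-detected candidate only if it lies within the distance the target
    could plausibly have travelled since it was last seen; among the plausible
    candidates, keep the highest-scoring one."""
    reachable = max_speed * (frames_missed + 1)   # pixels per frame (assumed bound)
    ok = [c for c in candidates
          if np.linalg.norm(np.array(c["pos"]) - np.array(last_pos)) <= reachable]
    return max(ok, key=lambda c: c["score"]) if ok else None

cands = [{"pos": (120, 80), "score": 0.7},    # nearby, physically plausible
         {"pos": (900, 600), "score": 0.9}]   # similar object far away
print(consistent_redetection(cands, last_pos=(110, 75), frames_missed=3))
```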

27 pages, 4648 KB  
Article
Day-Ahead Photovoltaic Power Forecasting Based on SN-Transformer-BiMixer
by Xiaohong Huang, Xiuzhen Ding, Yating Han, Qi Sima, Xiaokang Li and Yukun Bao
Energies 2025, 18(16), 4406; https://doi.org/10.3390/en18164406 - 19 Aug 2025
Viewed by 620
Abstract
Accurate forecasting of photovoltaic (PV) power is crucial for ensuring the safe and stable operation of power systems. However, the practical implementation of forecasting systems often faces challenges due to missing real-time historical power data, typically caused by sensor malfunctions or communication failures, which substantially hamper the performance of existing data-driven time-series forecasting techniques. To address these limitations, this study proposes a novel day-ahead PV forecasting approach based on similar-day analysis, i.e., SN-Transformer-BiMixer. Specifically, a Siamese network (SN) is employed to identify patterns analogous to the target day within a historical power dataset accumulated over an extended period, considering its superior ability to extract discriminative features and quantify similarities. By identifying similar historical days from multiple time scales using SN, a baseline generation pattern for the target day is established to allow forecasting without relying on real-time measurement data. Subsequently, a transformer model is used to refine these similar temporal curves, yielding improved multi-scale forecasting outputs. Finally, a bidirectional mixer (BiMixer) module is designed to synthesize similar curves across multiple scales, thereby providing more accurate forecast results. Experimental results demonstrate the superiority of the proposed model over existing approaches. Compared to Informer, SN-Transformer-BiMixer achieves an 11.32% reduction in root mean square error (RMSE). Moreover, the model exhibits strong robustness to missing data, outperforming the vanilla Transformer by 8.99% in RMSE. Full article
(This article belongs to the Special Issue New Progress in Electricity Demand Forecasting)
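
The similar-day idea can be sketched as a simple retrieval step: find the historical day whose feature vector is closest to the target day and take its power curve as the baseline forecast. The features, distance metric, and curve resolution below are assumptions; the SN embedding and the Transformer/BiMixer refinement are not reproduced.

```python
import numpy as np

def similar_day_baseline(target_features, history_features, history_power):
    """Pick the historical day whose feature vector is closest to the target day
    and use its PV power curve as the baseline forecast."""
    dists = np.linalg.norm(history_features - target_features, axis=1)
    return history_power[int(np.argmin(dists))]

rng = np.random.default_rng(2)
hist_feat = rng.normal(size=(365, 8))    # one feature vector per historical day
hist_power = rng.random(size=(365, 96))  # 15-min PV power curves (synthetic here)
baseline = similar_day_baseline(hist_feat[100] + 0.05, hist_feat, hist_power)
print(baseline.shape)  # (96,)
```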

21 pages, 1212 KB  
Article
A Semi-Supervised Approach to Characterise Microseismic Landslide Events from Big Noisy Data
by David Murray, Lina Stankovic and Vladimir Stankovic
Geosciences 2025, 15(8), 304; https://doi.org/10.3390/geosciences15080304 - 6 Aug 2025
Viewed by 612
Abstract
Most public seismic recordings, sampled at hundreds of Hz, tend to be unlabelled, i.e., not catalogued, mainly because of the sheer volume of samples and the amount of time needed by experts to confidently label detected events. This is especially challenging for very low signal-to-noise ratio microseismic events that characterise landslides during rock and soil mass displacement. Whilst numerous supervised machine learning models have been proposed to classify landslide events, they rely on a large amount of labelled datasets. Therefore, there is an urgent need to develop tools to effectively automate the data-labelling process from a small set of labelled samples. In this paper, we propose a semi-supervised method for labelling of signals recorded by seismometers that can reduce the time and expertise needed to create fully annotated datasets. The proposed Siamese network approach learns best class-exemplar anchors, leveraging learned similarity between these anchor embeddings and unlabelled signals. Classification is performed via soft-labelling and thresholding instead of hard class boundaries. Furthermore, network output explainability is used to explain misclassifications and we demonstrate the effect of anchors on performance, via ablation studies. The proposed approach classifies four landslide classes, namely earthquakes, micro-quakes, rockfall and anthropogenic noise, demonstrating good agreement with manually detected events while requiring few training data to be effective, hence reducing the time needed for labelling and updating models. Full article
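
A minimal sketch of anchor-based soft labelling with thresholding follows: similarity to per-class anchor embeddings is converted into a score, and low-confidence samples remain unlabelled instead of being forced into a class. The toy anchors, the softmax conversion, and the threshold are assumptions rather than the paper's exact formulation.

```python
import numpy as np

def soft_label(embedding, anchors, class_names, threshold=0.6):
    """Turn similarity to per-class anchor embeddings into a probability-like score;
    samples whose best score falls below the threshold stay unlabelled."""
    sims = anchors @ embedding / (
        np.linalg.norm(anchors, axis=1) * np.linalg.norm(embedding))
    probs = np.exp(sims) / np.exp(sims).sum()   # softmax over class similarities
    best = int(np.argmax(probs))
    return class_names[best] if probs[best] >= threshold else "unlabelled"

anchors = np.eye(4)   # toy anchor embeddings, one per class
classes = ["earthquake", "micro-quake", "rockfall", "anthropogenic noise"]
print(soft_label(np.array([0.9, 0.1, 0.0, 0.0]), anchors, classes, threshold=0.3))
```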

22 pages, 4258 KB  
Article
A Few-Shot SE-Relation Net-Based Electronic Nose for Discriminating COPD
by Zhuoheng Xie, Yao Tian and Pengfei Jia
Sensors 2025, 25(15), 4780; https://doi.org/10.3390/s25154780 - 3 Aug 2025
Viewed by 543
Abstract
We propose an advanced electronic nose based on SE-RelationNet for COPD diagnosis with limited breath samples. The model integrates residual blocks, BiGRU layers, and squeeze–excitation attention mechanisms to enhance feature-extraction efficiency. Experimental results demonstrate exceptional performance with minimal samples: in 4-way 1-shot tasks, the model achieves 85.8% mean accuracy (F1-score = 0.852), scaling to 93.3% accuracy (F1-score = 0.931) with four samples per class. Ablation studies confirm that the 5-layer residual structure and single-hidden-layer BiGRU optimize stability (h_F1-score ≤ 0.011). Compared to SiameseNet and ProtoNet, SE-RelationNet shows superior accuracy (>15% improvement in 1-shot tasks). This technology enables COPD detection with as few as one breath sample, facilitating early intervention to mitigate lung cancer risks in COPD patients. Full article
(This article belongs to the Special Issue Nature Inspired Engineering: Biomimetic Sensors (2nd Edition))
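
The relation-network scoring step can be sketched with a small learned head that maps a (query, class-prototype) embedding pair to a similarity score, as below; the SE-RelationNet backbone (residual blocks, BiGRU, squeeze-excitation) is not reproduced and all sizes are assumed.

```python
import torch
import torch.nn as nn

class RelationHead(nn.Module):
    """Learned relation module: maps a concatenated (query, class-prototype)
    embedding pair to a similarity score in [0, 1]."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * dim, 64), nn.ReLU(),
                                 nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, query, prototypes):
        pairs = torch.cat([query.expand(len(prototypes), -1), prototypes], dim=1)
        return self.net(pairs).squeeze(1)

query = torch.randn(1, 64)   # embedding of one breath sample
protos = torch.randn(4, 64)  # 4-way task: one prototype per class
scores = RelationHead()(query, protos)
print(int(scores.argmax()))  # predicted class index
```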