Search Results (151)

Search Parameters:
Keywords = range-gated imaging

24 pages, 4538 KB  
Article
CNN–Transformer-Based Model for Maritime Blurred Target Recognition
by Tianyu Huang, Chao Pan, Jin Liu and Zhiwei Kang
Electronics 2025, 14(17), 3354; https://doi.org/10.3390/electronics14173354 - 23 Aug 2025
Viewed by 328
Abstract
In maritime blurred image recognition, ship collision accidents frequently result from three primary blur types: (1) motion blur from vessel movement in complex sea conditions, (2) defocus blur due to water vapor refraction, and (3) scattering blur caused by sea fog interference. This paper proposes a dual-branch recognition method specifically designed for motion blur, the most prevalent blur type in maritime scenarios. Conventional approaches exhibit constrained computational efficiency and limited adaptability across modalities. To overcome these limitations, we propose a hybrid CNN–Transformer architecture: the CNN branch captures local blur characteristics, while the enhanced Transformer module models long-range dependencies via attention mechanisms. The CNN branch employs a lightweight ResNet variant in which conventional residual blocks are replaced with Multi-Scale Gradient-Aware Residual Blocks (MSG-ARBs). This branch uses learnable gradient convolution for explicit local gradient feature extraction and gradient content gating to strengthen blur-sensitive region representation, significantly improving computational efficiency compared with conventional CNNs. The Transformer branch incorporates a Hierarchical Swin Transformer (HST) framework with Shifted Window-based Multi-head Self-Attention for global context modeling. The proposed method incorporates a blur-invariant Positional Encoding (PE) to enhance blur spectrum modeling, and employs a DyT (Dynamic Tanh) module with learnable α parameters in place of traditional normalization layers. This architecture achieves a significant reduction in computational cost while preserving feature representation quality, and efficiently computes long-range image dependencies using a compact 16 × 16 window configuration. The proposed feature fusion module integrates CNN-based local feature extraction with Transformer-enabled global representation learning, achieving comprehensive feature modeling across scales. To evaluate the model’s performance and generalization ability, we conducted comprehensive experiments on four benchmark datasets: VAIS, GoPro, Mini-ImageNet, and Open Images V4. Experimental results show that our method achieves superior classification accuracy compared to state-of-the-art approaches while improving inference speed and reducing GPU memory consumption. Ablation studies confirm that the DyT module effectively suppresses outliers and improves computational efficiency, particularly when processing low-quality input data. Full article
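For reference, the Dynamic Tanh idea named above has a published formulation, DyT(x) = γ · tanh(αx) + β, which replaces a normalization layer with an element-wise squashing. A minimal PyTorch sketch; the initialization and this model's exact variant are assumptions:

```python
import torch
import torch.nn as nn

class DyT(nn.Module):
    """Dynamic Tanh: a drop-in replacement for a normalization layer.

    A minimal sketch of the published DyT formulation
    y = gamma * tanh(alpha * x) + beta; alpha's initialization and
    this paper's exact configuration are assumptions.
    """
    def __init__(self, dim: int, alpha_init: float = 0.5):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1) * alpha_init)  # learnable scalar
        self.gamma = nn.Parameter(torch.ones(dim))             # per-channel scale
        self.beta = nn.Parameter(torch.zeros(dim))             # per-channel shift

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # tanh squashes outliers instead of dividing by batch statistics,
        # which matches the ablation's claim of outlier suppression.
        return self.gamma * torch.tanh(self.alpha * x) + self.beta
```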

17 pages, 3374 KB  
Technical Note
A Novel Real-Time Multi-Channel Error Calibration Architecture for DBF-SAR
by Jinsong Qiu, Zhimin Zhang, Yunkai Deng, Heng Zhang, Wei Wang, Zhen Chen, Sixi Hou, Yihang Feng and Nan Wang
Remote Sens. 2025, 17(16), 2890; https://doi.org/10.3390/rs17162890 - 19 Aug 2025
Viewed by 524
Abstract
Digital Beamforming SAR (DBF-SAR) provides high-resolution wide-swath imaging capability, yet it is affected by inter-channel amplitude, phase, and time-delay errors induced by temperature variations and random error factors. Since all elevation channel data are weighted and summed by the DBF module in real time, conventional record-then-compensate approaches cannot meet real-time processing requirements. To resolve this problem, a real-time calibration architecture for Intermediate Frequency DBF (IFDBF) is presented in this paper. The Field-Programmable Gate Array (FPGA) implementation estimates amplitude errors through simple summation, time-delay errors via a simple counter, and phase errors via a single-bin Discrete-Time Fourier Transform (DTFT). The time-delay and phase error information is converted into single-tone frequency components through Dechirp processing. The method deliberately employs a reduced-length DTFT implementation to improve the adaptability of the delay estimation range, and completes calibration within tens of PRIs (under 1 s). The proposed method is analyzed and validated through a spaceborne simulation and X-band 16-channel DBF-SAR experiments. Full article
(This article belongs to the Section Remote Sensing Image Processing)
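The single-bin DTFT mentioned above is cheap precisely because only one frequency is evaluated. A NumPy sketch of that estimation step, assuming the post-Dechirp error appears as a single tone at a known frequency (variable names are illustrative):

```python
import numpy as np

def single_bin_dtft(x, f, fs):
    """Evaluate the DTFT of x at a single frequency f (Hz), sample rate fs.

    Sketch of the single-bin idea: after Dechirp processing an
    inter-channel delay appears as a single tone, so one DTFT bin at
    the expected frequency recovers amplitude and phase without a
    full FFT. Names and usage below are assumptions, not the paper's API.
    """
    n = np.arange(len(x))
    return np.sum(x * np.exp(-2j * np.pi * f * n / fs))

# Relative error of channel k versus the reference channel:
# X_ref = single_bin_dtft(ref_chan, f_tone, fs)
# X_k   = single_bin_dtft(chan_k,  f_tone, fs)
# amp_err, phase_err = abs(X_k) / abs(X_ref), np.angle(X_k / X_ref)
```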

24 pages, 3961 KB  
Article
Hierarchical Multi-Scale Mamba with Tubular Structure-Aware Convolution for Retinal Vessel Segmentation
by Tao Wang, Dongyuan Tian, Haonan Zhao, Jiamin Liu, Weijie Wang, Chunpei Li and Guixia Liu
Entropy 2025, 27(8), 862; https://doi.org/10.3390/e27080862 - 14 Aug 2025
Viewed by 462
Abstract
Retinal vessel segmentation plays a crucial role in diagnosing various retinal and cardiovascular diseases and serves as a foundation for computer-aided diagnostic systems. Blood vessels in color retinal fundus images, captured using fundus cameras, are often affected by illumination variations and noise, making it difficult to preserve vascular integrity and posing a significant challenge for vessel segmentation. In this paper, we propose HM-Mamba, a novel hierarchical multi-scale Mamba-based architecture that incorporates tubular structure-aware convolution to extract both local and global vascular features for retinal vessel segmentation. First, we introduce a tubular structure-aware convolution to reinforce vessel continuity and integrity. Building on this, we design a multi-scale fusion module that aggregates features across varying receptive fields, enhancing the model’s robustness in representing both primary trunks and fine branches. Second, we integrate a multi-branch Fourier transform with the dynamic state modeling capability of Mamba to capture both long-range dependencies and multi-frequency information. This design enables robust feature representation and adaptive fusion, thereby enhancing the network’s ability to model complex spatial patterns. Furthermore, we propose a hierarchical multi-scale interactive Mamba block that integrates multi-level encoder features through gated Mamba-based global context modeling and residual connections, enabling effective multi-scale semantic fusion and reducing detail loss during downsampling. Extensive evaluations on five widely used benchmark datasets—DRIVE, CHASE_DB1, STARE, IOSTAR, and LES-AV—demonstrate the superior performance of HM-Mamba, yielding Dice coefficients of 0.8327, 0.8197, 0.8239, 0.8307, and 0.8426, respectively. Full article
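For context, the Dice coefficient reported above is the standard overlap metric between a predicted and a reference vessel mask; a minimal NumPy version:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice score between binary segmentation masks of the same shape.

    This is the standard definition of the metric quoted in the
    abstract, not code from the paper itself.
    """
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```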

27 pages, 4225 KB  
Article
Data Sampling System for Processing Event Camera Data Using a Stochastic Neural Network on an FPGA
by Seth Shively, Nathaniel Jackson, Eugene Chabot, John DiCecco and Scott Koziol
Electronics 2025, 14(15), 3094; https://doi.org/10.3390/electronics14153094 - 2 Aug 2025
Viewed by 512
Abstract
The use of a stochastic artificial neural network (SANN) implemented on a Field Programmable Gate Array (FPGA) is a promising approach to image recognition on event camera recordings; however, event camera data is inherently uneven in the timing at which events leave the camera. This paper proposes a sampling system to overcome this challenge, in which all “events” occurring at specific timestamps in an event camera recording are selected (sampled), processed, and sent to the SANN at regular intervals. The system is implemented on an FPGA in SystemVerilog and tested by sending simulated event camera data from a computer running MATLAB (version 2022+). The sampling system is shown to be functional, and analysis demonstrates its performance with respect to data sparsity, time convergence, normalization, repeatability, range, and characteristics of the hold system. Full article
(This article belongs to the Section Circuit and Signal Processing)
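The core sampling idea, binning irregularly timed events so they reach the network at a fixed period, can be sketched in a few lines of Python (the field names and structured-array layout are assumptions, not the paper's interface):

```python
import numpy as np

def sample_events(events, period_us):
    """Group event-camera events into regular sampling intervals.

    Sketch of the stated idea: raw events arrive with uneven timing,
    so they are binned by timestamp and released downstream at a
    fixed period. `events` is assumed to be a structured array with
    fields t (microseconds), x, y, p.
    """
    t0 = events["t"].min()
    bins = ((events["t"] - t0) // period_us).astype(int)
    # One batch of events per regular interval, in temporal order.
    return [events[bins == k] for k in range(bins.max() + 1)]
```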

21 pages, 4490 KB  
Article
DFANet: A Deep Feature Attention Network for Building Change Detection in Remote Sensing Imagery
by Peigeng Lu, Haiyong Ding and Xiang Tian
Remote Sens. 2025, 17(15), 2575; https://doi.org/10.3390/rs17152575 - 24 Jul 2025
Cited by 1 | Viewed by 486
Abstract
Change detection (CD) in remote sensing (RS) is a fundamental task that seeks to identify changes in land cover by analyzing bitemporal images. In recent years, deep learning (DL)-based approaches have demonstrated remarkable success in a wide range of CD applications. However, most existing methods have limitations in detecting building edges and suppressing pseudo-changes, and lack the ability to model feature context. In this paper, we introduce DFANet—a Deep Feature Attention Network specifically designed for building CD in RS imagery. First, we devise a spatial-channel attention module to strengthen the network’s capacity to extract change cues from bitemporal feature maps and reduce the occurrence of pseudo-changes. Second, we introduce a GatedConv module to improve the network’s capability for building edge detection. Finally, a Transformer is introduced to capture long-range dependencies across bitemporal images, enabling the network to better understand feature change patterns and the relationships between different regions and land cover categories. We carried out comprehensive experiments on two publicly available building CD datasets—LEVIR-CD and WHU-CD. The results demonstrate that DFANet achieves exceptional performance in evaluation metrics such as precision, F1 score, and IoU, consistently outperforming existing state-of-the-art approaches. Full article
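The paper's spatial-channel attention module is not reproduced here; the sketch below shows the generic CBAM-style pattern such modules follow (channel reweighting from pooled statistics, then spatial reweighting), offered only as an assumption about its general shape:

```python
import torch
import torch.nn as nn

class SpatialChannelAttention(nn.Module):
    """CBAM-style sketch of a spatial-channel attention block.

    Illustrative only: reweight channels from globally pooled
    statistics, then reweight locations from channel-pooled maps.
    """
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.channel_mlp(x.mean(dim=(2, 3))).view(b, c, 1, 1)
        x = x * w                                   # channel attention
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial_conv(pooled)        # spatial attention
```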

23 pages, 7457 KB  
Article
An Efficient Ship Target Integrated Imaging and Detection Framework (ST-IIDF) for Space-Borne SAR Echo Data
by Can Su, Wei Yang, Yongchen Pan, Hongcheng Zeng, Yamin Wang, Jie Chen, Zhixiang Huang, Wei Xiong, Jie Chen and Chunsheng Li
Remote Sens. 2025, 17(15), 2545; https://doi.org/10.3390/rs17152545 - 22 Jul 2025
Viewed by 449
Abstract
Due to the sparse distribution of ship targets in wide-area offshore scenarios, the typical cascade of imaging followed by detection for space-borne Synthetic Aperture Radar (SAR) echo data consumes substantial computational time and resources, severely affecting the timeliness of ship information acquisition. Therefore, we propose a ship target integrated imaging and detection framework (ST-IIDF) for SAR oceanic region data. A two-step filtering structure is added to the SAR imaging process to extract potential ship target areas and accelerate the whole pipeline. First, an improved peak-valley detection method based on one-dimensional scattering characteristics locates the range gate units containing ship targets. Second, a dynamic quantization method is applied to the imaged range gate units to further determine the azimuth region. Finally, a lightweight YOLO neural network eliminates false alarm areas and obtains accurate ship positions. Experiments on Hisea-1 and Pujiang-2 data show that, in sparse target scenes, the framework maintains over 90% detection accuracy while processing data 35.95 times faster on average. The framework suits ship detection tasks with strict timeliness requirements and provides an effective solution for real-time onboard processing. Full article
(This article belongs to the Special Issue Efficient Object Detection Based on Remote Sensing Images)
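The first filtering step, peak-valley detection over one-dimensional scattering energy, amounts to finding range gates whose energy stands out from the sea-clutter floor. A hedged NumPy/SciPy sketch (thresholds and windows are illustrative, not the paper's):

```python
import numpy as np
from scipy.signal import find_peaks

def locate_ship_range_gates(echo, guard=5, k=3.0):
    """Locate candidate ship range gates from 1-D scattering energy.

    Sketch of the general idea: ships appear as energy peaks against
    a sea-clutter floor in the range-compressed profile. `echo` is
    assumed to be a 2-D (azimuth, range) complex array; the threshold
    factor k and peak spacing are illustrative assumptions.
    """
    profile = np.sum(np.abs(echo) ** 2, axis=0)     # energy per range gate
    floor = np.median(profile)                      # robust clutter estimate
    peaks, _ = find_peaks(profile, height=k * floor, distance=guard)
    return peaks  # range gates to pass to azimuth screening
```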

28 pages, 19790 KB  
Article
HSF-DETR: A Special Vehicle Detection Algorithm Based on Hypergraph Spatial Features and Bipolar Attention
by Kaipeng Wang, Guanglin He and Xinmin Li
Sensors 2025, 25(14), 4381; https://doi.org/10.3390/s25144381 - 13 Jul 2025
Viewed by 607
Abstract
Special vehicle detection in intelligent surveillance, emergency rescue, and reconnaissance faces significant challenges in accuracy and robustness under complex environments, necessitating advanced detection algorithms for critical applications. This paper proposes HSF-DETR (Hypergraph Spatial Feature DETR), integrating four innovative modules: a Cascaded Spatial Feature Network (CSFNet) backbone with Cross-Efficient Convolutional Gating (CECG) for enhanced long-range detection through hybrid state-space modeling; a Hypergraph-Enhanced Spatial Feature Modulation (HyperSFM) network utilizing hypergraph structures for high-order feature correlations and adaptive multi-scale fusion; a Dual-Domain Feature Encoder (DDFE) combining Bipolar Efficient Attention (BEA) and Frequency-Enhanced Feed-Forward Network (FEFFN) for precise feature weight allocation; and a Spatial-Channel Fusion Upsampling Block (SCFUB) improving feature fidelity through depth-wise separable convolution and channel shift mixing. Experiments conducted on a self-built special vehicle dataset containing 2388 images demonstrate that HSF-DETR achieves mAP50 and mAP50-95 of 96.6% and 70.6%, respectively, representing improvements of 3.1% and 4.6% over baseline RT-DETR while maintaining computational efficiency at 59.7 GFLOPs and 18.07 M parameters. Cross-domain validation on VisDrone2019 and BDD100K datasets confirms the method’s generalization capability and robustness across diverse scenarios, establishing HSF-DETR as an effective solution for special vehicle detection in complex environments. Full article
(This article belongs to the Section Sensing and Imaging)
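Of the listed modules, SCFUB names two widely used ingredients: depth-wise separable convolution and channel shift mixing. A sketch combining them (the composition order and shift fraction are assumptions, not the paper's design):

```python
import torch
import torch.nn as nn

class DepthwiseSeparableShift(nn.Module):
    """Sketch of SCFUB's two stated ingredients only: channel shift
    mixing followed by a depth-wise separable convolution. How the
    actual block composes them is not public here."""
    def __init__(self, channels: int, shift_div: int = 8):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.pw = nn.Conv2d(channels, channels, 1)
        self.shift_div = shift_div  # fraction of channels to rotate

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        c = x.shape[1] // self.shift_div
        x = torch.cat([x[:, c:], x[:, :c]], dim=1)  # cyclic channel shift mixing
        return self.pw(self.dw(x))                  # depth-wise then point-wise
```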

24 pages, 19550 KB  
Article
TMTS: A Physics-Based Turbulence Mitigation Network Guided by Turbulence Signatures for Satellite Video
by Jie Yin, Tao Sun, Xiao Zhang, Guorong Zhang, Xue Wan and Jianjun He
Remote Sens. 2025, 17(14), 2422; https://doi.org/10.3390/rs17142422 - 12 Jul 2025
Viewed by 436
Abstract
Atmospheric turbulence severely degrades high-resolution satellite videos through spatiotemporally coupled distortions, including temporal jitter, spatial-variant blur, deformation, and scintillation, thereby constraining downstream analytical capabilities. Restoring turbulence-corrupted videos poses a challenging ill-posed inverse problem due to the inherent randomness of turbulent fluctuations. While existing turbulence mitigation methods for long-range imaging demonstrate partial success, they exhibit limited generalizability and interpretability in large-scale satellite scenarios. Inspired by refractive-index structure constant (Cn²) estimation from degraded sequences, we propose a physics-informed turbulence signature (TS) prior that explicitly captures spatiotemporal distortion patterns to enhance model transparency. Integrating this prior into a lucky imaging framework, we develop a Physics-Based Turbulence Mitigation Network guided by Turbulence Signature (TMTS) to disentangle atmospheric disturbances from satellite videos. The framework employs deformable attention modules guided by turbulence signatures to correct geometric distortions, iterative gated mechanisms for temporal alignment stability, and adaptive multi-frame aggregation to address spatially varying blur. Comprehensive experiments on synthetic and real-world turbulence-degraded satellite videos demonstrate TMTS’s superiority, achieving 0.27 dB PSNR and 0.0015 SSIM improvements over the DATUM baseline while maintaining practical computational efficiency. By bridging turbulence physics with deep learning, our approach provides both performance enhancements and interpretable restoration mechanisms, offering a viable solution for operational satellite video processing under atmospheric disturbances. Full article
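The lucky-imaging framework the network builds on has a simple classical baseline: aggregate frames with per-pixel weights proportional to local sharpness. A sketch of that baseline only; the paper's learned, turbulence-signature-guided version is far more elaborate:

```python
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def lucky_aggregate(frames, win=7):
    """Sharpness-weighted multi-frame aggregation (lucky-imaging style).

    Classical baseline idea, not the paper's method: pixels drawn from
    locally sharp frames receive higher weight, so turbulence-blurred
    regions are dominated by their "lucky" sharp observations.
    """
    stack = np.stack(frames).astype(np.float64)          # (T, H, W)
    sharp = np.stack([uniform_filter(laplace(f) ** 2, win) for f in stack])
    w = sharp / (sharp.sum(axis=0, keepdims=True) + 1e-12)
    return (w * stack).sum(axis=0)                       # fused frame
```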

20 pages, 90560 KB  
Article
A Hybrid MIL Approach Leveraging Convolution and State-Space Model for Whole-Slide Image Cancer Subtyping
by Dehui Bi and Yuqi Zhang
Mathematics 2025, 13(13), 2178; https://doi.org/10.3390/math13132178 - 3 Jul 2025
Viewed by 429
Abstract
Precise identification of cancer subtypes from whole slide images (WSIs) is pivotal in tailoring patient-specific therapies. Under the weakly supervised multiple instance learning (MIL) paradigm, existing techniques frequently fall short in simultaneously capturing local tissue textures and long-range contextual relationships. To address these challenges, we introduce ConvMixerSSM, a hybrid model that integrates a ConvMixer block for local spatial representation, a state space model (SSM) block for capturing long-range dependencies, and a feature-gated block to enhance informative feature selection. The model was evaluated on the TCGA-NSCLC dataset and the CAMELYON16 dataset for cancer subtyping tasks. Extensive experiments, including comparisons with state-of-the-art MIL methods and ablation studies, were conducted to assess the contribution of each component. ConvMixerSSM achieved an AUC of 97.83%, an ACC of 91.82%, and an F1 score of 91.18%, outperforming existing MIL baselines on the TCGA-NSCLC dataset. The ablation study revealed that each block contributed positively to performance, with the full model showing the most balanced and superior results. Moreover, our visualization results further confirm that ConvMixerSSM can effectively identify tumor regions within WSIs, providing model interpretability and clinical relevance. These findings suggest that ConvMixerSSM has strong potential for advancing computational pathology applications in clinical decision-making. Full article
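The ConvMixer block named above is a published design: a depth-wise convolution with a residual connection followed by a point-wise convolution. A standard PyTorch rendering (kernel size and activation follow the original ConvMixer paper and are assumptions about this model's configuration):

```python
import torch
import torch.nn as nn

class ConvMixerBlock(nn.Module):
    """Standard ConvMixer block: residual depth-wise (spatial) mixing,
    then point-wise (channel) mixing. This is the published design the
    abstract names, not code from ConvMixerSSM itself."""
    def __init__(self, dim: int, kernel_size: int = 9):
        super().__init__()
        self.depthwise = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size, groups=dim, padding="same"),
            nn.GELU(), nn.BatchNorm2d(dim))
        self.pointwise = nn.Sequential(
            nn.Conv2d(dim, dim, 1), nn.GELU(), nn.BatchNorm2d(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.depthwise(x)   # residual over spatial mixing
        return self.pointwise(x)    # channel mixing
```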

16 pages, 2521 KB  
Article
A Multimodal CMOS Readout IC for SWIR Image Sensors with Dual-Mode BDI/DI Pixels and Column-Parallel Two-Step Single-Slope ADC
by Yuyan Zhang, Zhifeng Chen, Yaguang Yang, Huangwei Chen, Jie Gao, Zhichao Zhang and Chengying Chen
Micromachines 2025, 16(7), 773; https://doi.org/10.3390/mi16070773 - 30 Jun 2025
Viewed by 684
Abstract
This paper proposes a dual-mode CMOS analog front-end (AFE) circuit for short-wave infrared (SWIR) image sensors, which integrates a hybrid readout integrated circuit (ROIC) and a 12-bit two-step single-slope analog-to-digital converter (TS-SS ADC). The ROIC dynamically switches between buffered-direct-injection (BDI) and direct-injection (DI) modes, thus balancing injection efficiency against power consumption. While the DI structure offers simplicity and low power, it suffers from unstable biasing and reduced injection efficiency under high background currents. Conversely, the BDI structure enhances injection efficiency and bias stability via an input buffer but incurs higher power consumption. To address this trade-off, a dual-mode injection architecture with mode-switching transistors is implemented. Mode selection is executed in-pixel via a low-leakage transmission gate and coordinated by the column timing controller, enabling low-current pixels to operate in the low-noise BDI mode, whereas high-current pixels revert to the low-power DI mode. The TS-SS ADC employs a four-terminal comparator and dynamic reference voltage compensation to mitigate charge leakage and offset, which improves signal-to-noise ratio (SNR) and linearity. The prototype occupies 2.1 mm × 2.88 mm in a 0.18 µm CMOS process and serves a 64 × 64 array. The AFE achieves a dynamic range of 75.58 dB, noise of 249.42 μV, and a power consumption of 81.04 mW. Full article
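The two-step single-slope conversion trades one long ramp for a coarse ramp plus a fine ramp, cutting conversion time from 2^12 steps to roughly 2^4 + 2^8. A behavioral Python model of that principle (the bit split and reference voltage are assumptions, not the paper's design values):

```python
def ts_ss_adc(vin, vref=1.0, coarse_bits=4, fine_bits=8):
    """Behavioral model of a two-step single-slope (TS-SS) conversion.

    Sketch only: a coarse ramp of 2**coarse_bits large steps brackets
    the input, then a fine ramp resolves the residue, so a 12-bit
    conversion needs ~2**4 + 2**8 steps instead of 2**12.
    """
    coarse_lsb = vref / (1 << coarse_bits)
    fine_lsb = coarse_lsb / (1 << fine_bits)
    # Step 1: coarse single-slope search for the containing segment.
    msb = min(int(vin / coarse_lsb), (1 << coarse_bits) - 1)
    # Step 2: fine single-slope over the residue inside that segment.
    residue = vin - msb * coarse_lsb
    lsb = min(int(residue / fine_lsb), (1 << fine_bits) - 1)
    return (msb << fine_bits) | lsb  # 12-bit code

print(ts_ss_adc(0.5))  # mid-scale input -> code 2048
```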

24 pages, 2843 KB  
Article
Classification of Maize Images Enhanced with Slot Attention Mechanism in Deep Learning Architectures
by Zafer Cömert, Alper Talha Karadeniz, Erdal Basaran and Yuksel Celik
Electronics 2025, 14(13), 2635; https://doi.org/10.3390/electronics14132635 - 30 Jun 2025
Viewed by 438
Abstract
Maize is a vital global crop and a fundamental component of food security. To support sustainable maize production, the accurate classification of maize seeds—particularly distinguishing haploid from diploid types—is essential for enhancing breeding efficiency. Conventional methods relying on manual inspection or simple machine learning are prone to errors and unsuitable for large-scale data. To overcome these limitations, we propose Slot-Maize, a novel deep learning architecture that integrates Convolutional Neural Networks (CNN), Slot Attention, Gated Recurrent Units (GRU), and Long Short-Term Memory (LSTM) layers. The Slot-Maize model was evaluated using two datasets: the Maize Seed Dataset and the Maize Variety Dataset. The Slot Attention module improves feature representation by focusing on object-centric regions within seed images. The GRU captures short-term sequential patterns in extracted features, while the LSTM models long-range dependencies, enhancing temporal understanding. Furthermore, Grad-CAM was utilized as an explainable AI technique to enhance the interpretability of the model’s decisions. The model demonstrated an accuracy of 96.97% on the Maize Seed Dataset and 92.30% on the Maize Variety Dataset, outperforming existing methods in both cases. These results demonstrate the model’s robustness, generalizability, and potential to accelerate automated maize breeding workflows. In conclusion, Slot-Maize provides a robust and interpretable solution for automated maize seed classification; by combining accuracy with explainability, it offers a reliable tool for precision agriculture. Full article
(This article belongs to the Special Issue Data-Related Challenges in Machine Learning: Theory and Application)
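The stated CNN → Slot Attention → GRU → LSTM stack can be skeletonized as below; the Slot Attention module itself is omitted (a placeholder 1×1 convolution stands in) and all dimensions and layer counts are assumptions:

```python
import torch
import torch.nn as nn

class SlotMaizeSketch(nn.Module):
    """Skeleton of the stated CNN -> (attention) -> GRU -> LSTM stack.

    Illustrative only: the real Slot Attention module is replaced by
    a placeholder, and sizes do not come from the paper.
    """
    def __init__(self, n_classes: int = 2, dim: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU())
        self.slots = nn.Conv2d(dim, dim, 1)              # placeholder for Slot Attention
        self.gru = nn.GRU(dim, dim, batch_first=True)    # short-term patterns
        self.lstm = nn.LSTM(dim, dim, batch_first=True)  # long-range context
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.slots(self.cnn(x))                 # (B, C, H, W)
        seq = f.flatten(2).transpose(1, 2)          # (B, H*W, C) as a sequence
        seq, _ = self.gru(seq)
        seq, _ = self.lstm(seq)
        return self.head(seq[:, -1])                # classify from last step
```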

13 pages, 1142 KB  
Article
Flash 3D Imaging of Far-Field Dynamic Objects: An EMCCD-Based Polarization Modulation System
by Shengjie Wang, Xiaojia Yang, Donglin Su, Weiqi Cao and Xianhao Zhang
Sensors 2025, 25(13), 3852; https://doi.org/10.3390/s25133852 - 20 Jun 2025
Viewed by 360
Abstract
High-resolution 3D visualization of dynamic environments is critical for applications such as remote sensing. Traditional 3D imaging systems, such as lidar, rely on avalanche photodiode (APD) arrays to determine the flight time of light for each scene pixel. In this context, we introduce and demonstrate a high-resolution 3D imaging approach leveraging an Electron Multiplying Charge Coupled Device (EMCCD). This sensor’s low bandwidth properties allow for the use of electro-optic modulators to achieve both temporal resolution and rapid shuttering at sub-nanosecond speeds. This enables range-gated 3D imaging, which significantly enhances the signal-to-noise ratio (SNR) within our proposed framework. By employing a dual EMCCD setup, it is possible to reconstruct both a depth image and a grayscale image from a single raw data frame, thereby improving dynamic imaging capabilities, irrespective of object or platform movement. Additionally, the adaptive gate-opening range technology can further refine the range resolution of specific scene objects to as low as 10 cm. Full article
(This article belongs to the Section Sensing and Imaging)
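Range gating rests on the round-trip relation depth = c·t/2: the gate's opening delay sets the near edge of the imaged slice and its width sets the slice depth, so a sub-nanosecond shutter slices at roughly 15 cm per nanosecond of gate width, consistent with the ~10 cm figure above. A small numeric check:

```python
C = 299_792_458.0  # speed of light, m/s

def gate_to_depth(delay_ns: float, gate_ns: float):
    """Map a range gate's opening delay and width to a depth slice.

    Standard range-gating relation, not this paper's specific design:
    light travels out and back, so depth = c * t / 2.
    """
    near = C * delay_ns * 1e-9 / 2
    far = C * (delay_ns + gate_ns) * 1e-9 / 2
    return near, far

# ~1 km stand-off with a 0.67 ns gate -> a slice about 10 cm deep.
print(gate_to_depth(6667.0, 0.67))
```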

24 pages, 1732 KB  
Article
Model-Based Design of Contrast-Limited Histogram Equalization for Low-Complexity, High-Speed, and Low-Power Tone-Mapping Operation
by Wei Dong, Maikon Nascimento and Dileepan Joseph
Electronics 2025, 14(12), 2416; https://doi.org/10.3390/electronics14122416 - 13 Jun 2025
Viewed by 469
Abstract
Imaging applications involving outdoor scenes and fast motion require sensing and processing of high-dynamic-range images at video rates. In turn, image signal processing pipelines that serve low-dynamic-range displays require tone mapping operators (TMOs). For high-speed and low-power applications with low-cost field-programmable gate arrays (FPGAs), global TMOs that employ contrast-limited histogram equalization prove ideal. To develop such TMOs, this work proposes a MATLAB–Simulink–Vivado design flow. A realized design capable of megapixel video rates using milliwatts of power requires only a fraction of the resources available in the lowest-cost Artix-7 device from Xilinx (now Advanced Micro Devices). Unlike histogram-based TMO approaches for nonlinear sensors in the literature, this work exploits Simulink modeling to reduce the total required FPGA memory by orders of magnitude with minimal impact on video output. After refactoring an approach from the literature that incorporates two subsystems (Base Histograms and Tone Mapping) to one incorporating four subsystems (Scene Histogram, Perceived Histogram, Tone Function, and Global Mapping), memory is exponentially reduced by introducing a fifth subsystem (Interpolation). As a crucial stepping stone between MATLAB algorithm abstraction and Vivado circuit realization, the Simulink modeling facilitated a bit-true design flow. Full article
(This article belongs to the Special Issue Design of Low-Voltage and Low-Power Integrated Circuits)
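The operator class implemented here, global contrast-limited histogram equalization, clips the histogram so no bin dominates, redistributes the excess counts, and maps pixels through the resulting cumulative distribution. A floating-point NumPy sketch of that operator; the bin count and clip limit are illustrative, unlike the paper's fixed-point FPGA design:

```python
import numpy as np

def clhe_tone_map(img, out_levels=256, clip_frac=0.01):
    """Global contrast-limited histogram equalization as a TMO sketch.

    Shows the operator class the paper implements in hardware, not the
    paper's bit-true pipeline: clip, redistribute, then map through
    the cumulative distribution (the tone function).
    """
    hist, edges = np.histogram(img, bins=4096)
    limit = max(1, int(clip_frac * img.size))
    excess = np.sum(np.maximum(hist - limit, 0))
    hist = np.minimum(hist, limit) + excess // len(hist)  # clip + redistribute
    cdf = np.cumsum(hist).astype(np.float64)
    cdf = (cdf - cdf[0]) / (cdf[-1] - cdf[0] + 1e-12)
    tone = np.round(cdf * (out_levels - 1))               # tone function
    idx = np.clip(np.digitize(img, edges[1:-1]), 0, 4095)
    return tone[idx].astype(np.uint8)                     # global mapping
```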

14 pages, 753 KB  
Article
A Hybrid Deep Learning-Based Load Forecasting Model for Logical Range
by Hao Chen and Zheng Dang
Appl. Sci. 2025, 15(10), 5628; https://doi.org/10.3390/app15105628 - 18 May 2025
Cited by 1 | Viewed by 426
Abstract
The Logical Range is a mission-oriented, reconfigurable environment that integrates testing, training, and simulation by virtually connecting distributed systems. In such environments, task-processing devices often experience highly dynamic workloads due to varying task demands, leading to scheduling inefficiencies and increased latency. To address this, we propose GCSG, a hybrid load forecasting model tailored for Logical Range operations. GCSG transforms time-series device load data into image representations using Gramian Angular Field (GAF) encoding, extracts spatial features via a Convolutional Neural Network (CNN) enhanced with a Squeeze-and-Excitation network (SENet), and captures temporal dependencies using a Gated Recurrent Unit (GRU). Through the integration of spatial–temporal features, GCSG enables accurate load forecasting, supporting more efficient resource scheduling. Experiments show that GCSG achieves an R² of 0.86, an MAE of 4.5, and an MSE of 34, outperforming baseline models in terms of both accuracy and generalization. Full article
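GAF encoding, the first stage of GCSG, is a standard construction: rescale the series to [-1, 1], take φ = arccos(x), and form the matrix cos(φi + φj). A NumPy sketch (whether the paper uses the summation or difference field is not stated in the abstract, so the variant is an assumption):

```python
import numpy as np

def gramian_angular_field(series):
    """Encode a 1-D load series as a GAF image (summation variant).

    Standard GASF construction: rescale to [-1, 1], map each value to
    an angle phi = arccos(x), and form G[i, j] = cos(phi_i + phi_j).
    """
    x = np.asarray(series, dtype=np.float64)
    x = 2 * (x - x.min()) / (x.max() - x.min() + 1e-12) - 1  # to [-1, 1]
    phi = np.arccos(np.clip(x, -1, 1))
    return np.cos(phi[:, None] + phi[None, :])  # (N, N) image for the CNN

img = gramian_angular_field(np.sin(np.linspace(0, 6, 96)))  # 96-step demo
```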

28 pages, 2489 KB  
Article
A Hybrid Learnable Fusion of ConvNeXt and Swin Transformer for Optimized Image Classification
by Jaber Qezelbash-Chamak and Karen Hicklin
IoT 2025, 6(2), 30; https://doi.org/10.3390/iot6020030 - 16 May 2025
Cited by 1 | Viewed by 2585
Abstract
Medical image classification often relies on CNNs to capture local details (e.g., lesions, nodules) or on transformers to model long-range dependencies. However, each paradigm alone is limited in addressing both fine-grained structures and broader anatomical context. We propose ConvTransGFusion, a hybrid model that fuses ConvNeXt (for refined convolutional features) and Swin Transformer (for hierarchical global attention) using a learnable dual-attention gating mechanism. By aligning spatial dimensions, scaling each branch adaptively, and applying both channel and spatial attention, the proposed architecture bridges local and global representations, melding fine-grained lesion details with the broader anatomical context essential for accurate diagnosis. Tested on four diverse medical imaging datasets—including X-ray, ultrasound, and MRI scans—the proposed model consistently achieves superior accuracy, precision, recall, F1, and AUC over state-of-the-art CNNs and transformers. Our findings highlight the benefits of combining convolutional inductive biases and transformer-based global context in a single learnable framework, positioning ConvTransGFusion as a robust and versatile solution for real-world clinical applications. Full article
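The fusion pattern described, adaptive per-branch scaling plus channel and spatial attention on the fused map, can be sketched as below; the actual ConvTransGFusion wiring is not public here, so this is an assumption about its general shape:

```python
import torch
import torch.nn as nn

class GatedBranchFusion(nn.Module):
    """Sketch of a learnable gated fusion of two aligned feature maps.

    Illustrative pattern only: adaptive per-branch scales, then channel
    and spatial gating on the fused map.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(2))         # per-branch scales
        self.channel_gate = nn.Sequential(
            nn.Linear(channels, channels), nn.Sigmoid())
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, conv_feat, trans_feat):
        # Assumes both branches were resized to the same (B, C, H, W).
        fused = self.alpha[0] * conv_feat + self.alpha[1] * trans_feat
        cw = self.channel_gate(fused.mean(dim=(2, 3)))   # (B, C)
        fused = fused * cw[:, :, None, None]             # channel attention
        return fused * self.spatial_gate(fused)          # spatial attention
```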
