Deep Learning-Based Image Restoration and Object Identification

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: 15 January 2025 | Viewed by 9436

Special Issue Editors

Dr. Qiang Wang
Guest Editor
Key Laboratory of Manufacturing Industrial Integrated, Shenyang University, Shenyang 110044, China
Interests: image restoration; object tracking and re-identification

Dr. Weihong Ren
Guest Editor
School of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen 518055, China
Interests: object tracking; action recognition; image restoration

Dr. Huijie Fan
Guest Editor
State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Interests: image restoration; object detection and tracking

Special Issue Information

Dear Colleagues,

Image restoration and object identification are among the most challenging tasks in computer vision applications. Image restoration is essential to guarantee the success of subsequent stages of computer vision pipelines, such as detection and segmentation, since it can recover useful textural and structural information and eliminate the effect of irrelevant information. Object identification is a computer vision technology that deals with recognizing instances of semantic objects (such as humans, buildings, or cars) in images and videos. It has attracted increasing attention in recent years due to its wide range of applications, such as security monitoring, autonomous driving, transportation surveillance, and robotic vision. This Special Issue aims to explore recent advances and trends in the use of deep learning and computer vision methods for image restoration and object identification, and seeks original contributions that point out possible ways to deal with image data recovery and identification. Topics include, but are not limited to, deep learning techniques, low-level image processing, image restoration, object recognition/detection, and person/car re-identification.

Dr. Qiang Wang
Dr. Weihong Ren
Dr. Huijie Fan
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image restoration
  • object identification
  • person re-identification
  • object detection and tracking
  • autonomous driving
  • scene understanding
  • transfer learning

Published Papers (9 papers)

Research

15 pages, 17295 KiB  
Article
Progressive Discriminative Feature Learning for Visible-Infrared Person Re-Identification
by Feng Zhou, Zhuxuan Cheng, Haitao Yang, Yifeng Song and Shengpeng Fu
Electronics 2024, 13(14), 2825; https://doi.org/10.3390/electronics13142825 - 18 Jul 2024
Viewed by 349
Abstract
The visible-infrared person re-identification (VI-ReID) task aims to retrieve the same pedestrian between visible and infrared images. VI-ReID is a challenging task due to the huge modality discrepancy and complex intra-modality variations. Existing works mainly complete the modality alignment at one stage. However, aligning modalities at different stages has positive effects on the intra-class and inter-class distances of cross-modality features, which are often ignored. Moreover, discriminative features with identity information may be corrupted during modality alignment, further degrading the performance of person re-identification. In this paper, we propose a progressive discriminative feature learning (PDFL) network that adopts different alignment strategies at different stages to alleviate the discrepancy and learn discriminative features progressively. Specifically, we first design an adaptive cross fusion module (ACFM) to learn the identity-relevant features via modality alignment with channel-level attention. To preserve identity information well, we propose a dual-attention-guided instance normalization module (DINM), which guides instance normalization to align the two modalities into a unified feature space through channel and spatial information embedding. Finally, we generate multiple part features of a person to mine subtle differences. Multi-loss optimization is imposed during the training process for more effective learning supervision. Extensive experiments on the public SYSU-MM01 and RegDB datasets validate that our proposed method performs favorably against most state-of-the-art methods.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
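
To make the mechanism concrete, here is a minimal PyTorch-style sketch of attention-guided instance normalization in the spirit of the DINM described above; the module name, layer layout, and gating scheme are our own illustrative assumptions, not the authors' implementation.

```python
# Sketch: blend instance-normalized features with the originals using learned
# channel and spatial attention, so identity cues survive modality alignment.
# All names and design choices here are illustrative assumptions.
import torch
import torch.nn as nn

class AttentionGuidedIN(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.inorm = nn.InstanceNorm2d(channels, affine=True)
        # Channel attention: squeeze spatial dims, excite per channel.
        self.channel_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid(),
        )
        # Spatial attention from pooled channel statistics.
        self.spatial_att = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        normed = self.inorm(x)                      # modality-invariant statistics
        ca = self.channel_att(x)                    # (B, C, 1, 1)
        sa = self.spatial_att(torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1))
        gate = ca * sa                              # where to trust normalization
        return gate * normed + (1.0 - gate) * x     # keep identity info elsewhere

feats = torch.randn(8, 64, 48, 24)                  # e.g. a part-level feature map
print(AttentionGuidedIN(64)(feats).shape)           # torch.Size([8, 64, 48, 24])
```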

28 pages, 67696 KiB  
Article
PerNet: Progressive and Efficient All-in-One Image-Restoration Lightweight Network
by Wentao Li, Guang Zhou, Sen Lin and Yandong Tang
Electronics 2024, 13(14), 2817; https://doi.org/10.3390/electronics13142817 - 17 Jul 2024
Viewed by 414
Abstract
Existing image-restoration methods are only effective for specific degradation tasks, but the type of image degradation in practical applications is unknown, and a mismatch between the model and the actual degradation leads to performance decline. Attention mechanisms play an important role in image-restoration tasks; however, existing attention mechanisms struggle to effectively exploit the spatially continuous correlations of image noise. To solve these problems, we propose a Progressive and Efficient All-in-One Image-Restoration Lightweight Network (PerNet). The network is built around a Plug-and-Play Efficient Local Attention Module (PPELAM), which is composed of multiple Efficient Local Attention Units (ELAUs) and can effectively use the global information and the horizontal and vertical correlations of image degradation features in space, reducing information loss while keeping the parameter count small. PerNet learns the degradation properties of images very well, which allows it to reach an advanced level in image-restoration tasks. Experiments show that PerNet achieves excellent results on typical restoration tasks (image deraining, dehazing, desnowing, and underwater image enhancement), and the strong performance of ELAU combined with a Transformer in the ablation experiments further demonstrates the efficiency of ELAU.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
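
As a rough illustration of the directional-correlation idea behind the ELAU, the sketch below attends along rows and columns separately so spatially continuous degradations (rain streaks, haze gradients) can be modeled cheaply; the structure is assumed for illustration and is not the published PPELAM.

```python
# Sketch of a directional (horizontal/vertical) attention unit; names and
# exact layout are illustrative assumptions, not the paper's architecture.
import torch
import torch.nn as nn

class DirectionalAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv_h = nn.Conv2d(channels, channels, 1)
        self.conv_w = nn.Conv2d(channels, channels, 1)
        self.act = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pool across width -> per-row descriptor; across height -> per-column.
        row = x.mean(dim=3, keepdim=True)   # (B, C, H, 1)
        col = x.mean(dim=2, keepdim=True)   # (B, C, 1, W)
        # Broadcasting the two gated descriptors yields a full (B, C, H, W) map.
        att = self.act(self.conv_h(row)) * self.act(self.conv_w(col))
        return x * att

x = torch.randn(2, 32, 128, 128)
print(DirectionalAttention(32)(x).shape)    # torch.Size([2, 32, 128, 128])
```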

15 pages, 5604 KiB  
Article
Real-Time Deep Learning Framework for Accurate Speed Estimation of Surrounding Vehicles in Autonomous Driving
by Iván García-Aguilar, Jorge García-González, Enrique Domínguez, Ezequiel López-Rubio and Rafael M. Luque-Baena
Electronics 2024, 13(14), 2790; https://doi.org/10.3390/electronics13142790 - 16 Jul 2024
Viewed by 451
Abstract
Accurate speed estimation of surrounding vehicles is of paramount importance for autonomous driving to prevent potential hazards. This paper emphasizes the critical role of precise speed estimation and presents a novel real-time framework based on deep learning to achieve this from images captured by an onboard camera. The system detects and tracks vehicles using convolutional neural networks and analyzes their trajectories with a tracking algorithm. Vehicle speeds are then accurately estimated using a regression model based on random sample consensus. A synthetic dataset was generated using the CARLA simulator to validate the presented methodology. The system can simultaneously estimate the speed of multiple vehicles and can be easily integrated into onboard computer systems, providing a cost-effective solution for real-time speed estimation. This technology holds significant potential for enhancing vehicle safety systems, driver assistance, and autonomous driving.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
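
The RANSAC-based speed regression can be illustrated in a few lines: robustly fit distance versus time and read the speed off the slope, so occasional tracking glitches do not corrupt the estimate. The data below are synthetic and the threshold is an illustrative assumption.

```python
# Sketch: estimate a tracked vehicle's speed by robustly fitting distance vs.
# time with RANSAC. Synthetic data with a known ground-truth speed of 15 m/s.
import numpy as np
from sklearn.linear_model import LinearRegression, RANSACRegressor

rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0, 60).reshape(-1, 1)        # 60 frames over 2 seconds
dist = 15.0 * t.ravel() + rng.normal(0, 0.2, 60)    # noisy range measurements
dist[::10] += 5.0                                    # tracking glitches (outliers)

ransac = RANSACRegressor(LinearRegression(), residual_threshold=1.0)
ransac.fit(t, dist)
speed_mps = ransac.estimator_.coef_[0]              # slope of the inlier fit
print(f"estimated speed: {speed_mps:.2f} m/s (~{speed_mps * 3.6:.1f} km/h)")
```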

21 pages, 11115 KiB  
Article
HA-Net: A Hybrid Algorithm Model for Underwater Image Color Restoration and Texture Enhancement
by Jin Qian, Hui Li and Bin Zhang
Electronics 2024, 13(13), 2623; https://doi.org/10.3390/electronics13132623 - 4 Jul 2024
Viewed by 461
Abstract
Due to the extremely irregular nonlinear degradation of images obtained in real underwater environments, it is difficult for existing underwater image enhancement methods to stably restore degraded underwater images, thus making it challenging to improve the efficiency of marine work. We propose a hybrid algorithm model for underwater image color restoration and texture enhancement, termed HA-Net. First, we introduce a dynamic color correction algorithm based on depth estimation to restore degraded images and mitigate color attenuation in underwater images by calculating the depth of targets and backgrounds. Then, we propose a multi-scale U-Net to enhance the network's feature extraction capability and introduce a parallel attention module to capture image spatial information, thereby improving the model's accuracy in recognizing deep semantics such as fine texture. Finally, we propose a global information compensation algorithm to enhance the output image's integrity and boost the network's learning ability. Experimental results on synthetic standard datasets and real data demonstrate that our method produces images with clear texture and bright colors, outperforming other algorithms in both subjective and objective evaluations, making it more suitable for real marine environments.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
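
A minimal sketch of depth-aware color compensation follows, assuming a simplified Beer-Lambert attenuation model and an illustrative red-channel coefficient; the paper's actual dynamic correction algorithm is more elaborate, but the core idea is that red light attenuates fastest with distance, so the compensation should grow with estimated depth.

```python
# Sketch: boost the red channel proportionally to per-pixel depth, inverting a
# simplified Beer-Lambert attenuation model. beta_r is an assumed value.
import numpy as np

def compensate_red(img: np.ndarray, depth: np.ndarray, beta_r: float = 0.6):
    """img: float32 RGB in [0, 1]; depth: per-pixel range in meters."""
    out = img.copy()
    out[..., 0] = np.clip(img[..., 0] * np.exp(beta_r * depth), 0.0, 1.0)
    return out

h, w = 4, 4
img = np.full((h, w, 3), 0.2, dtype=np.float32)     # uniformly dim red channel
depth = np.linspace(0.5, 3.0, h * w, dtype=np.float32).reshape(h, w)
print(compensate_red(img, depth)[..., 0])            # red grows with depth
```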

15 pages, 8754 KiB  
Article
EFE-CNA Net: An Approach for Effective Image Deblurring Using an Edge-Sensitive Focusing Encoder
by Fengbo Zheng, Xiu Zhang, Lifen Jiang and Gongbo Liang
Electronics 2024, 13(13), 2493; https://doi.org/10.3390/electronics13132493 - 26 Jun 2024
Viewed by 861
Abstract
Deep learning-based image deblurring techniques have made great advancements, improving both processing speed and deblurring efficacy. However, existing methods still face challenges when dealing with complex blur types and the semantic understanding of images. The segment anything model (SAM), a versatile deep learning model that accurately and efficiently segments objects in images, facilitates various tasks in computer vision. This article leverages SAM's proficiency in capturing object edges and enhancing image content comprehension to improve image deblurring. We introduce the edge-sensitive focusing encoder (EFE) module, which takes the masks generated by the SAM framework and re-weights each masked region according to its features and high-frequency information. The EFE module uses the masks to locate the position of the blur in an image while identifying the intensity of the blur, allowing the model to focus more accurately on specific features. Masks with greater high-frequency information are assigned higher weights, prompting the network to prioritize them during processing. Based on the EFE module, we develop a deblurring network called the edge-sensitive focusing encoder-based convolution-normalization and attention network (EFE-CNA Net), which utilizes the EFE module to enhance the deblurring process, employs an image-mask decoder to merge features from the image and the mask produced by the EFE module, and incorporates the CNA Net as its base network. This design enables the model to focus on distinct features at various locations, enhancing its learning process through the guidance provided by the EFE module and the blurred images. Testing results on the RealBlur and REDS datasets demonstrate the effectiveness of the EFE-CNA Net, achieving peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) scores of 28.77/0.902 on RealBlur-J, 36.40/0.956 on RealBlur-R, and 31.45/0.919 on REDS.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
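
The mask re-weighting step can be sketched as scoring each segment by its high-frequency energy; in the sketch below a Laplacian response stands in for the paper's feature-based scoring, and the masks are synthetic rather than actual SAM outputs.

```python
# Sketch: score each segment by mean high-frequency energy and normalize the
# scores into weights, so sharper (more textured) segments get priority.
import numpy as np
from scipy.ndimage import laplace

def mask_weights(image: np.ndarray, masks: list) -> np.ndarray:
    """image: float grayscale array; masks: list of boolean arrays, same shape."""
    hf = np.abs(laplace(image))                       # high-frequency response
    scores = np.array([hf[m].mean() if m.any() else 0.0 for m in masks])
    return scores / (scores.sum() + 1e-8)             # normalized weights

img = np.zeros((64, 64))
img[16:48, 16:48] = np.random.default_rng(0).random((32, 32))  # textured patch
masks = [np.zeros((64, 64), bool), np.zeros((64, 64), bool)]
masks[0][:, :32] = True                                # left half
masks[1][:, 32:] = True                                # right half
print(mask_weights(img, masks))                        # both halves are textured,
                                                       # so weights are comparable
```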

15 pages, 32016 KiB  
Article
A Multiscale Parallel Pedestrian Recognition Algorithm Based on YOLOv5
by Qi Song, ZongHe Zhou, ShuDe Ji, Tong Cui, BuDan Yao and ZeQi Liu
Electronics 2024, 13(10), 1989; https://doi.org/10.3390/electronics13101989 - 20 May 2024
Viewed by 647
Abstract
Mainstream pedestrian recognition algorithms suffer from problems such as low accuracy and insufficient real-time performance. In this study, we developed an improved pedestrian recognition algorithm named YOLO-MSP (multiscale parallel) based on residual network ideas, improving the network architecture of YOLOv5s. Three pooling layers are used in parallel in the MSP module to output multiscale features and improve the accuracy of the model while ensuring real-time performance. The Swin Transformer module is also introduced into the network, which improves the efficiency of the model in image processing by avoiding global computation. The CBAM (Convolutional Block Attention Module) attention mechanism is added to the C3 module; this new module, named CBAMC3, improves model efficiency while keeping the model lightweight. The proposed WMD-IOU (weighted multidimensional IOU) loss function uses the shape change between the predicted box and the ground-truth box as a parameter when computing the shape loss, which guides the model to better learn the shape and size of the target and optimizes recognition performance. Comparative experiments on the public INRIA dataset showed that the proposed YOLO-MSP algorithm outperforms state-of-the-art pedestrian recognition methods in accuracy and speed.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
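
A minimal sketch of a three-branch parallel pooling block in the spirit of the MSP module follows; the kernel sizes and the 1x1 fusion convolution are assumptions close to the familiar SPP design, not the paper's exact configuration.

```python
# Sketch: run three max-pooling branches in parallel at different receptive
# fields, concatenate with the input, and fuse channels with a 1x1 conv.
import torch
import torch.nn as nn

class MultiScalePool(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
            for k in (5, 9, 13))                      # illustrative kernel sizes
        self.fuse = nn.Conv2d(channels * 4, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x] + [p(x) for p in self.pools]      # original + 3 scales
        return self.fuse(torch.cat(feats, dim=1))     # channel-wise fusion

x = torch.randn(1, 128, 40, 40)
print(MultiScalePool(128)(x).shape)                   # torch.Size([1, 128, 40, 40])
```

Because the branches run with stride 1 and matching padding, the spatial size is preserved, so the block can be dropped into a backbone without changing downstream shapes.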

15 pages, 606 KiB  
Article
Towards Super Compressed Neural Networks for Object Identification: Quantized Low-Rank Tensor Decomposition with Self-Attention
by Baichen Liu, Dongwei Wang, Qi Lv, Zhi Han and Yandong Tang
Electronics 2024, 13(7), 1330; https://doi.org/10.3390/electronics13071330 - 2 Apr 2024
Viewed by 809
Abstract
Deep convolutional neural networks have a large number of parameters and require a significant number of floating-point operations during computation, which limits their deployment in situations where the storage space is limited and computational resources are insufficient, such as in mobile phones and small robots. Many network compression methods have been proposed to address the aforementioned issues, including pruning, low-rank decomposition, quantization, etc. However, these methods typically fail to achieve a significant compression ratio in terms of the parameter count. Even when high compression rates are achieved, the network's performance is often significantly deteriorated, making it difficult to perform tasks effectively. In this study, we propose a more compact representation for neural networks, named Quantized Low-Rank Tensor Decomposition (QLTD), to super compress deep convolutional neural networks. Firstly, we employed low-rank Tucker decomposition to compress the pre-trained weights. Subsequently, to further exploit redundancies within the core tensor and factor matrices obtained through Tucker decomposition, we employed vector quantization to partition and cluster the weights. Simultaneously, we introduced a self-attention module for each core tensor and factor matrix to enhance the training responsiveness in critical regions. The object identification results in the CIFAR-10 experiment showed that QLTD achieved a compression ratio of 35.43× with less than a 1% loss in accuracy, and a compression ratio of 90.61× with less than a 2% loss in accuracy. QLTD was able to achieve a significant compression ratio in terms of the parameter count and realize a good balance between compressing parameters and maintaining identification accuracy.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
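
The two compression steps can be sketched with tensorly and scikit-learn: Tucker-decompose a convolution kernel, then vector-quantize a factor matrix with k-means so only a codebook plus per-row indices need to be stored. The ranks and cluster counts below are illustrative assumptions, not the paper's settings, and the self-attention refinement step is omitted.

```python
# Sketch: (1) low-rank Tucker decomposition of a 4-D conv kernel;
# (2) vector quantization of a factor matrix via k-means clustering.
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker
from sklearn.cluster import KMeans

tl.set_backend("numpy")
weight = np.random.default_rng(0).standard_normal((64, 32, 3, 3))  # conv kernel

# Step 1: Tucker decomposition with illustrative ranks.
core, factors = tucker(tl.tensor(weight), rank=[16, 8, 3, 3])

# Step 2: cluster the rows of the first factor matrix; store only the
# codebook (cluster centers) and one index per row.
f0 = factors[0]                                   # shape (64, 16)
km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(f0)
codebook, codes = km.cluster_centers_, km.labels_
f0_quantized = codebook[codes]                    # reconstructed factor matrix

orig = weight.size
compressed = (core.size + sum(f.size for f in factors[1:])
              + codebook.size + codes.size)
print(f"parameter ratio ~ {orig / compressed:.1f}x")
```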

17 pages, 6098 KiB  
Article
MIX-Net: Hybrid Attention/Diversity Network for Person Re-Identification
by Minglang Li, Zhiyong Tao, Sen Lin and Kaihao Feng
Electronics 2024, 13(5), 1001; https://doi.org/10.3390/electronics13051001 - 6 Mar 2024
Viewed by 908
Abstract
Person re-identification (Re-ID) networks are often affected by factors such as pose variations, changes in viewpoint, and occlusion, leading to the extraction of features that encompass a considerable amount of irrelevant information. However, most research has struggled to address the challenge of simultaneously endowing features with both attentive and diversified information. To concurrently extract attentive yet diverse pedestrian features, we amalgamated the strengths of convolutional neural network (CNN) attention and self-attention. By integrating the extracted latent features, we introduced a Hybrid Attention/Diversity Network (MIX-Net), which adeptly captures attentive but diverse information from personal images via a fusion of attention branches and attention suppression branches. Additionally, to extract latent information from secondary important regions to enrich the diversity of features, we designed a novel Discriminative Part Mask (DPM). Experimental results establish the robust competitiveness of our approach, particularly in effectively distinguishing individuals with similar attributes.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
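
A toy sketch of the attention/attention-suppression pairing: one branch pools the most salient region, the other masks that region out so secondary regions must contribute, yielding diverse features. The threshold and pooling choices below are our own assumptions for illustration.

```python
# Sketch: produce an attentive feature vector and a complementary "diverse"
# vector by suppressing the peak-saliency region before pooling.
import torch

def diverse_features(fmap: torch.Tensor, drop_ratio: float = 0.9):
    """fmap: (B, C, H, W) backbone features -> (attentive, diverse) vectors."""
    saliency = fmap.mean(dim=1, keepdim=True).sigmoid()   # (B, 1, H, W) in (0, 1)
    thresh = saliency.amax(dim=(2, 3), keepdim=True) * drop_ratio
    suppress = (saliency < thresh).float()                # hide the peak region
    attentive = (fmap * saliency).mean(dim=(2, 3))        # attention branch
    diverse = (fmap * suppress).mean(dim=(2, 3))          # suppression branch
    return attentive, diverse

fmap = torch.randn(4, 256, 24, 12)
a, d = diverse_features(fmap)
print(a.shape, d.shape)   # torch.Size([4, 256]) torch.Size([4, 256])
```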

Review

23 pages, 3374 KiB  
Review
A Review: Remote Sensing Image Object Detection Algorithm Based on Deep Learning
by Chenshuai Bai, Xiaofeng Bai and Kaijun Wu
Electronics 2023, 12(24), 4902; https://doi.org/10.3390/electronics12244902 - 6 Dec 2023
Cited by 2 | Viewed by 2918
Abstract
Target detection in optical remote sensing images using deep-learning technologies has a wide range of applications in urban building detection, road extraction, crop monitoring, and forest fire monitoring, which provides strong support for environmental monitoring, urban planning, and agricultural management. This paper reviews the research progress of the YOLO series, the SSD series, the candidate-region (region proposal) series, and Transformer-based algorithms. It summarizes object detection algorithms built on common improvement methods such as supervision, attention mechanisms, and multi-scale processing. The performance of different algorithms is also compared and analyzed on common remote sensing image datasets. Finally, future research challenges, improvement directions, and issues of concern are discussed, providing valuable ideas for subsequent related research.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
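
As a practical companion to the review, running a pre-trained detector of the region-proposal series takes only a few lines with torchvision; this is a generic hedged example using torchvision's own Faster R-CNN API, not anything from the review itself, and the random tensor merely stands in for a real remote sensing tile.

```python
# Sketch: inference with a COCO-pretrained Faster R-CNN from torchvision.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(weights="DEFAULT")   # downloads COCO weights
model.eval()

image = torch.rand(3, 512, 512)                      # stand-in for a real image
with torch.no_grad():
    pred = model([image])[0]                         # dict: boxes, labels, scores

keep = pred["scores"] > 0.5                          # simple confidence filter
print(pred["boxes"][keep], pred["labels"][keep])
```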