Deep Learning-Based Image Restoration and Object Identification

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: 15 January 2025 | Viewed by 9436

Special Issue Editors

Dr. Qiang Wang
Guest Editor
Key Laboratory of Manufacturing Industrial Integrated, Shenyang University, Shenyang 110044, China
Interests: image restoration; object tracking and re-identification

Dr. Weihong Ren
Guest Editor
School of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen 518055, China
Interests: object tracking; action recognition; image restoration

Dr. Huijie Fan
Guest Editor
State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Interests: image restoration; object detection and tracking

Special Issue Information

Dear Colleagues,

Image restoration and object identification are among the most challenging tasks in computer vision applications. Image restoration is essential to guarantee the success of subsequent stages of computer vision pipelines, such as detection and segmentation, since it can recover useful textural and structural information and eliminate the effect of irrelevant information. Object identification is a computer vision technology that deals with recognizing instances of semantic objects (such as humans, buildings, or cars) in images and videos. It has attracted increasing attention in recent years due to its wide range of applications, such as security monitoring, autonomous driving, transportation surveillance, and robotic vision. This Special Issue aims to explore recent advances and trends in the use of deep learning and computer vision methods for image restoration and object identification, and seeks original contributions that point out possible ways to deal with image data recovery and identification. Topics include, but are not limited to, deep learning techniques, low-level image processing, image restoration, object recognition/detection, and person/car re-identification.

Dr. Qiang Wang
Dr. Weihong Ren
Dr. Huijie Fan
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image restoration
  • object identification
  • person re-identification
  • object detection and tracking
  • autonomous driving
  • scene understanding
  • transfer learning

Published Papers (9 papers)

Research

15 pages, 17295 KiB  
Article
Progressive Discriminative Feature Learning for Visible-Infrared Person Re-Identification
by Feng Zhou, Zhuxuan Cheng, Haitao Yang, Yifeng Song and Shengpeng Fu
Electronics 2024, 13(14), 2825; https://doi.org/10.3390/electronics13142825 - 18 Jul 2024
Viewed by 349
Abstract
The visible-infrared person re-identification (VI-ReID) task aims to retrieve the same pedestrian between visible and infrared images. VI-ReID is a challenging task due to the huge modality discrepancy and complex intra-modality variations. Existing works mainly complete the modality alignment at one stage. However, aligning modalities at different stages has positive effects on the intra-class and inter-class distances of cross-modality features, which are often ignored. Moreover, discriminative features with identity information may be corrupted during modality alignment, further degrading the performance of person re-identification. In this paper, we propose a progressive discriminative feature learning (PDFL) network that adopts different alignment strategies at different stages to alleviate the discrepancy and learn discriminative features progressively. Specifically, we first design an adaptive cross fusion module (ACFM) to learn the identity-relevant features via modality alignment with channel-level attention. To preserve identity information well, we propose a dual-attention-guided instance normalization module (DINM), which guides instance normalization to align the two modalities into a unified feature space through channel and spatial information embedding. Finally, we generate multiple part features of a person to mine subtle differences. Multi-loss optimization is imposed during the training process for more effective learning supervision. Extensive experiments on the public SYSU-MM01 and RegDB datasets validate that our proposed method performs favorably against most state-of-the-art methods.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
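
To make the mechanism concrete, here is a minimal PyTorch-style sketch of attention-guided instance normalization in the spirit of the DINM described above; the module name, layer layout, and gating scheme are our own illustrative assumptions, not the authors' implementation.

```python
# Sketch: blend instance-normalized features with the originals using learned
# channel and spatial attention, so identity cues survive modality alignment.
# All names and design choices here are illustrative assumptions.
import torch
import torch.nn as nn

class AttentionGuidedIN(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.inorm = nn.InstanceNorm2d(channels, affine=True)
        # Channel attention: squeeze spatial dims, excite per channel.
        self.channel_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid(),
        )
        # Spatial attention from pooled channel statistics.
        self.spatial_att = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        normed = self.inorm(x)                      # modality-invariant statistics
        ca = self.channel_att(x)                    # (B, C, 1, 1)
        sa = self.spatial_att(torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1))
        gate = ca * sa                              # where to trust normalization
        return gate * normed + (1.0 - gate) * x     # keep identity info elsewhere

feats = torch.randn(8, 64, 48, 24)                  # e.g. a part-level feature map
print(AttentionGuidedIN(64)(feats).shape)           # torch.Size([8, 64, 48, 24])
```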

28 pages, 67696 KiB  
Article
PerNet: Progressive and Efficient All-in-One Image-Restoration Lightweight Network
by Wentao Li, Guang Zhou, Sen Lin and Yandong Tang
Electronics 2024, 13(14), 2817; https://doi.org/10.3390/electronics13142817 - 17 Jul 2024
Viewed by 414
Abstract
Existing image-restoration methods are only effective for specific degradation tasks, but the type of image degradation in practical applications is unknown, and a mismatch between the model and the actual degradation leads to performance decline. Attention mechanisms play an important role in image-restoration tasks; however, existing attention mechanisms struggle to effectively exploit the spatially continuous correlations of image noise. To solve these problems, we propose a Progressive and Efficient All-in-One Image-Restoration Lightweight Network (PerNet). The network is built around a Plug-and-Play Efficient Local Attention Module (PPELAM), which is composed of multiple Efficient Local Attention Units (ELAUs) and can effectively use the global information and the horizontal and vertical correlations of image degradation features in space, reducing information loss while keeping the parameter count small. PerNet learns the degradation properties of images very well, which allows it to reach an advanced level in image-restoration tasks. Experiments show that PerNet achieves excellent results on typical restoration tasks (image deraining, dehazing, desnowing, and underwater image enhancement), and the strong performance of ELAU combined with a Transformer in the ablation experiments further demonstrates the efficiency of ELAU.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
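
As a rough illustration of the directional-correlation idea behind the ELAU, the sketch below attends along rows and columns separately so spatially continuous degradations (rain streaks, haze gradients) can be modeled cheaply; the structure is assumed for illustration and is not the published PPELAM.

```python
# Sketch of a directional (horizontal/vertical) attention unit; names and
# exact layout are illustrative assumptions, not the paper's architecture.
import torch
import torch.nn as nn

class DirectionalAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv_h = nn.Conv2d(channels, channels, 1)
        self.conv_w = nn.Conv2d(channels, channels, 1)
        self.act = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pool across width -> per-row descriptor; across height -> per-column.
        row = x.mean(dim=3, keepdim=True)   # (B, C, H, 1)
        col = x.mean(dim=2, keepdim=True)   # (B, C, 1, W)
        # Broadcasting the two gated descriptors yields a full (B, C, H, W) map.
        att = self.act(self.conv_h(row)) * self.act(self.conv_w(col))
        return x * att

x = torch.randn(2, 32, 128, 128)
print(DirectionalAttention(32)(x).shape)    # torch.Size([2, 32, 128, 128])
```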

15 pages, 5604 KiB  
Article
Real-Time Deep Learning Framework for Accurate Speed Estimation of Surrounding Vehicles in Autonomous Driving
by Iván García-Aguilar, Jorge García-González, Enrique Domínguez, Ezequiel López-Rubio and Rafael M. Luque-Baena
Electronics 2024, 13(14), 2790; https://doi.org/10.3390/electronics13142790 - 16 Jul 2024
Viewed by 451
Abstract
Accurate speed estimation of surrounding vehicles is of paramount importance for autonomous driving to prevent potential hazards. This paper emphasizes the critical role of precise speed estimation and presents a novel real-time framework based on deep learning to achieve this from images captured by an onboard camera. The system detects and tracks vehicles using convolutional neural networks and analyzes their trajectories with a tracking algorithm. Vehicle speeds are then accurately estimated using a regression model based on random sample consensus. A synthetic dataset was generated using the CARLA simulator to validate the presented methodology. The system can simultaneously estimate the speed of multiple vehicles and can be easily integrated into onboard computer systems, providing a cost-effective solution for real-time speed estimation. This technology holds significant potential for enhancing vehicle safety systems, driver assistance, and autonomous driving.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
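
The RANSAC-based speed regression can be illustrated in a few lines: robustly fit distance versus time and read the speed off the slope, so occasional tracking glitches do not corrupt the estimate. The data below are synthetic and the threshold is an illustrative assumption.

```python
# Sketch: estimate a tracked vehicle's speed by robustly fitting distance vs.
# time with RANSAC. Synthetic data with a known ground-truth speed of 15 m/s.
import numpy as np
from sklearn.linear_model import LinearRegression, RANSACRegressor

rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0, 60).reshape(-1, 1)        # 60 frames over 2 seconds
dist = 15.0 * t.ravel() + rng.normal(0, 0.2, 60)    # noisy range measurements
dist[::10] += 5.0                                    # tracking glitches (outliers)

ransac = RANSACRegressor(LinearRegression(), residual_threshold=1.0)
ransac.fit(t, dist)
speed_mps = ransac.estimator_.coef_[0]              # slope of the inlier fit
print(f"estimated speed: {speed_mps:.2f} m/s (~{speed_mps * 3.6:.1f} km/h)")
```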

21 pages, 11115 KiB  
Article
HA-Net: A Hybrid Algorithm Model for Underwater Image Color Restoration and Texture Enhancement
by Jin Qian, Hui Li and Bin Zhang
Electronics 2024, 13(13), 2623; https://doi.org/10.3390/electronics13132623 - 4 Jul 2024
Viewed by 461
Abstract
Due to the extremely irregular nonlinear degradation of images obtained in real underwater environments, it is difficult for existing underwater image enhancement methods to stably restore degraded underwater images, thus making it challenging to improve the efficiency of marine work. We propose a hybrid algorithm model for underwater image color restoration and texture enhancement, termed HA-Net. First, we introduce a dynamic color correction algorithm based on depth estimation to restore degraded images and mitigate color attenuation in underwater images by calculating the depth of targets and backgrounds. Then, we propose a multi-scale U-Net to enhance the network's feature extraction capability and introduce a parallel attention module to capture image spatial information, thereby improving the model's accuracy in recognizing deep semantics such as fine texture. Finally, we propose a global information compensation algorithm to enhance the output image's integrity and boost the network's learning ability. Experimental results on synthetic standard datasets and real data demonstrate that our method produces images with clear texture and bright colors, outperforming other algorithms in both subjective and objective evaluations, making it more suitable for real marine environments.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
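
A minimal sketch of depth-aware color compensation follows, assuming a simplified Beer-Lambert attenuation model and an illustrative red-channel coefficient; the paper's actual dynamic correction algorithm is more elaborate, but the core idea is that red light attenuates fastest with distance, so the compensation should grow with estimated depth.

```python
# Sketch: boost the red channel proportionally to per-pixel depth, inverting a
# simplified Beer-Lambert attenuation model. beta_r is an assumed value.
import numpy as np

def compensate_red(img: np.ndarray, depth: np.ndarray, beta_r: float = 0.6):
    """img: float32 RGB in [0, 1]; depth: per-pixel range in meters."""
    out = img.copy()
    out[..., 0] = np.clip(img[..., 0] * np.exp(beta_r * depth), 0.0, 1.0)
    return out

h, w = 4, 4
img = np.full((h, w, 3), 0.2, dtype=np.float32)     # uniformly dim red channel
depth = np.linspace(0.5, 3.0, h * w, dtype=np.float32).reshape(h, w)
print(compensate_red(img, depth)[..., 0])            # red grows with depth
```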

15 pages, 8754 KiB  
Article
EFE-CNA Net: An Approach for Effective Image Deblurring Using an Edge-Sensitive Focusing Encoder
by Fengbo Zheng, Xiu Zhang, Lifen Jiang and Gongbo Liang
Electronics 2024, 13(13), 2493; https://doi.org/10.3390/electronics13132493 - 26 Jun 2024
Viewed by 861
Abstract
Deep learning-based image deblurring techniques have made great advancements, improving both processing speed and deblurring efficacy. However, existing methods still face challenges when dealing with complex blur types and the semantic understanding of images. The segment anything model (SAM), a versatile deep learning model that accurately and efficiently segments objects in images, facilitates various tasks in computer vision. This article leverages SAM's proficiency in capturing object edges and enhancing image content comprehension to improve image deblurring. We introduce the edge-sensitive focusing encoder (EFE) module, which takes the masks generated by the SAM framework and re-weights each masked region according to its features and high-frequency information. The EFE module uses the masks to locate the position of the blur in an image while identifying the intensity of the blur, allowing the model to focus more accurately on specific features. Masks with greater high-frequency information are assigned higher weights, prompting the network to prioritize them during processing. Based on the EFE module, we develop a deblurring network called the edge-sensitive focusing encoder-based convolution-normalization and attention network (EFE-CNA Net), which utilizes the EFE module to enhance the deblurring process, employs an image-mask decoder to merge features from the image and the mask produced by the EFE module, and incorporates the CNA Net as its base network. This design enables the model to focus on distinct features at various locations, enhancing its learning process through the guidance provided by the EFE module and the blurred images. Testing results on the RealBlur and REDS datasets demonstrate the effectiveness of the EFE-CNA Net, achieving peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) scores of 28.77/0.902 on RealBlur-J, 36.40/0.956 on RealBlur-R, and 31.45/0.919 on REDS.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
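
The mask re-weighting step can be sketched as scoring each segment by its high-frequency energy; in the sketch below a Laplacian response stands in for the paper's feature-based scoring, and the masks are synthetic rather than actual SAM outputs.

```python
# Sketch: score each segment by mean high-frequency energy and normalize the
# scores into weights, so sharper (more textured) segments get priority.
import numpy as np
from scipy.ndimage import laplace

def mask_weights(image: np.ndarray, masks: list) -> np.ndarray:
    """image: float grayscale array; masks: list of boolean arrays, same shape."""
    hf = np.abs(laplace(image))                       # high-frequency response
    scores = np.array([hf[m].mean() if m.any() else 0.0 for m in masks])
    return scores / (scores.sum() + 1e-8)             # normalized weights

img = np.zeros((64, 64))
img[16:48, 16:48] = np.random.default_rng(0).random((32, 32))  # textured patch
masks = [np.zeros((64, 64), bool), np.zeros((64, 64), bool)]
masks[0][:, :32] = True                                # left half
masks[1][:, 32:] = True                                # right half
print(mask_weights(img, masks))                        # both halves are textured,
                                                       # so weights are comparable
```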

15 pages, 32016 KiB  
Article
A Multiscale Parallel Pedestrian Recognition Algorithm Based on YOLOv5
by Qi Song, ZongHe Zhou, ShuDe Ji, Tong Cui, BuDan Yao and ZeQi Liu
Electronics 2024, 13(10), 1989; https://doi.org/10.3390/electronics13101989 - 20 May 2024
Viewed by 647
Abstract
Mainstream pedestrian recognition algorithms suffer from problems such as low accuracy and insufficient real-time performance. In this study, we developed an improved pedestrian recognition algorithm named YOLO-MSP (multiscale parallel) based on residual network ideas, improving the network architecture of YOLOv5s. Three pooling layers are used in parallel in the MSP module to output multiscale features and improve the accuracy of the model while ensuring real-time performance. The Swin Transformer module is also introduced into the network, which improves the efficiency of the model in image processing by avoiding global computation. The CBAM (Convolutional Block Attention Module) attention mechanism is added to the C3 module; this new module, named CBAMC3, improves model efficiency while keeping the model lightweight. The proposed WMD-IOU (weighted multidimensional IOU) loss function uses the shape change between the predicted box and the ground-truth box as a parameter when computing the shape loss, which guides the model to better learn the shape and size of the target and optimizes recognition performance. Comparative experiments on the public INRIA dataset showed that the proposed YOLO-MSP algorithm outperforms state-of-the-art pedestrian recognition methods in accuracy and speed.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
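
A minimal sketch of a three-branch parallel pooling block in the spirit of the MSP module follows; the kernel sizes and the 1x1 fusion convolution are assumptions close to the familiar SPP design, not the paper's exact configuration.

```python
# Sketch: run three max-pooling branches in parallel at different receptive
# fields, concatenate with the input, and fuse channels with a 1x1 conv.
import torch
import torch.nn as nn

class MultiScalePool(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
            for k in (5, 9, 13))                      # illustrative kernel sizes
        self.fuse = nn.Conv2d(channels * 4, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x] + [p(x) for p in self.pools]      # original + 3 scales
        return self.fuse(torch.cat(feats, dim=1))     # channel-wise fusion

x = torch.randn(1, 128, 40, 40)
print(MultiScalePool(128)(x).shape)                   # torch.Size([1, 128, 40, 40])
```

Because the branches run with stride 1 and matching padding, the spatial size is preserved, so the block can be dropped into a backbone without changing downstream shapes.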

15 pages, 606 KiB  
Article
Towards Super Compressed Neural Networks for Object Identification: Quantized Low-Rank Tensor Decomposition with Self-Attention
by Baichen Liu, Dongwei Wang, Qi Lv, Zhi Han and Yandong Tang
Electronics 2024, 13(7), 1330; https://doi.org/10.3390/electronics13071330 - 2 Apr 2024
Viewed by 809
Abstract
Deep convolutional neural networks have a large number of parameters and require a significant number of floating-point operations during computation, which limits their deployment in situations where the storage space is limited and computational resources are insufficient, such as in mobile phones and small robots. Many network compression methods have been proposed to address the aforementioned issues, including pruning, low-rank decomposition, quantization, etc. However, these methods typically fail to achieve a significant compression ratio in terms of the parameter count. Even when high compression rates are achieved, the network's performance is often significantly deteriorated, making it difficult to perform tasks effectively. In this study, we propose a more compact representation for neural networks, named Quantized Low-Rank Tensor Decomposition (QLTD), to super compress deep convolutional neural networks. Firstly, we employed low-rank Tucker decomposition to compress the pre-trained weights. Subsequently, to further exploit redundancies within the core tensor and factor matrices obtained through Tucker decomposition, we employed vector quantization to partition and cluster the weights. Simultaneously, we introduced a self-attention module for each core tensor and factor matrix to enhance the training responsiveness in critical regions. The object identification results in the CIFAR-10 experiment showed that QLTD achieved a compression ratio of 35.43× with less than a 1% loss in accuracy, and a compression ratio of 90.61× with less than a 2% loss in accuracy. QLTD was able to achieve a significant compression ratio in terms of the parameter count and realize a good balance between compressing parameters and maintaining identification accuracy.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
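
The two compression steps can be sketched with tensorly and scikit-learn: Tucker-decompose a convolution kernel, then vector-quantize a factor matrix with k-means so only a codebook plus per-row indices need to be stored. The ranks and cluster counts below are illustrative assumptions, not the paper's settings, and the self-attention refinement step is omitted.

```python
# Sketch: (1) low-rank Tucker decomposition of a 4-D conv kernel;
# (2) vector quantization of a factor matrix via k-means clustering.
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker
from sklearn.cluster import KMeans

tl.set_backend("numpy")
weight = np.random.default_rng(0).standard_normal((64, 32, 3, 3))  # conv kernel

# Step 1: Tucker decomposition with illustrative ranks.
core, factors = tucker(tl.tensor(weight), rank=[16, 8, 3, 3])

# Step 2: cluster the rows of the first factor matrix; store only the
# codebook (cluster centers) and one index per row.
f0 = factors[0]                                   # shape (64, 16)
km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(f0)
codebook, codes = km.cluster_centers_, km.labels_
f0_quantized = codebook[codes]                    # reconstructed factor matrix

orig = weight.size
compressed = (core.size + sum(f.size for f in factors[1:])
              + codebook.size + codes.size)
print(f"parameter ratio ~ {orig / compressed:.1f}x")
```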

17 pages, 6098 KiB  
Article
MIX-Net: Hybrid Attention/Diversity Network for Person Re-Identification
by Minglang Li, Zhiyong Tao, Sen Lin and Kaihao Feng
Electronics 2024, 13(5), 1001; https://doi.org/10.3390/electronics13051001 - 6 Mar 2024
Viewed by 908
Abstract
Person re-identification (Re-ID) networks are often affected by factors such as pose variations, changes in viewpoint, and occlusion, leading to the extraction of features that encompass a considerable amount of irrelevant information. However, most research has struggled to address the challenge of simultaneously endowing features with both attentive and diversified information. To concurrently extract attentive yet diverse pedestrian features, we amalgamated the strengths of convolutional neural network (CNN) attention and self-attention. By integrating the extracted latent features, we introduced a Hybrid Attention/Diversity Network (MIX-Net), which adeptly captures attentive but diverse information from personal images via a fusion of attention branches and attention suppression branches. Additionally, to extract latent information from secondary important regions to enrich the diversity of features, we designed a novel Discriminative Part Mask (DPM). Experimental results establish the robust competitiveness of our approach, particularly in effectively distinguishing individuals with similar attributes.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
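
A toy sketch of the attention/attention-suppression pairing: one branch pools the most salient region, the other masks that region out so secondary regions must contribute, yielding diverse features. The threshold and pooling choices below are our own assumptions for illustration.

```python
# Sketch: produce an attentive feature vector and a complementary "diverse"
# vector by suppressing the peak-saliency region before pooling.
import torch

def diverse_features(fmap: torch.Tensor, drop_ratio: float = 0.9):
    """fmap: (B, C, H, W) backbone features -> (attentive, diverse) vectors."""
    saliency = fmap.mean(dim=1, keepdim=True).sigmoid()   # (B, 1, H, W) in (0, 1)
    thresh = saliency.amax(dim=(2, 3), keepdim=True) * drop_ratio
    suppress = (saliency < thresh).float()                # hide the peak region
    attentive = (fmap * saliency).mean(dim=(2, 3))        # attention branch
    diverse = (fmap * suppress).mean(dim=(2, 3))          # suppression branch
    return attentive, diverse

fmap = torch.randn(4, 256, 24, 12)
a, d = diverse_features(fmap)
print(a.shape, d.shape)   # torch.Size([4, 256]) torch.Size([4, 256])
```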

Review

23 pages, 3374 KiB  
Review
A Review: Remote Sensing Image Object Detection Algorithm Based on Deep Learning
by Chenshuai Bai, Xiaofeng Bai and Kaijun Wu
Electronics 2023, 12(24), 4902; https://doi.org/10.3390/electronics12244902 - 6 Dec 2023
Cited by 2 | Viewed by 2918
Abstract
Target detection in optical remote sensing images using deep-learning technologies has a wide range of applications in urban building detection, road extraction, crop monitoring, and forest fire monitoring, which provides strong support for environmental monitoring, urban planning, and agricultural management. This paper reviews the research progress of the YOLO series, the SSD series, the candidate-region (region proposal) series, and Transformer-based algorithms. It summarizes object detection algorithms built on common improvement methods such as supervision, attention mechanisms, and multi-scale processing. The performance of different algorithms is also compared and analyzed on common remote sensing image datasets. Finally, future research challenges, improvement directions, and issues of concern are discussed, providing valuable ideas for subsequent related research.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
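
As a practical companion to the review, running a pre-trained detector of the region-proposal series takes only a few lines with torchvision; this is a generic hedged example using torchvision's own Faster R-CNN API, not anything from the review itself, and the random tensor merely stands in for a real remote sensing tile.

```python
# Sketch: inference with a COCO-pretrained Faster R-CNN from torchvision.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(weights="DEFAULT")   # downloads COCO weights
model.eval()

image = torch.rand(3, 512, 512)                      # stand-in for a real image
with torch.no_grad():
    pred = model([image])[0]                         # dict: boxes, labels, scores

keep = pred["scores"] > 0.5                          # simple confidence filter
print(pred["boxes"][keep], pred["labels"][keep])
```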